1 EXECUTIVE SUMMARY


This study describes the patterns of productivity growth across eighteen industries. We examine
the components of this productivity growth by estimating the contribution of entry, exit,
within-firm growth and re-allocation to productivity growth in Australia in the period 2002-2013.


We use an experimental linked dataset of 10 million workers across 1.5 million firms. We produce
industry-level estimates using firm-level data across 18 industries. We estimate worker- and
firm-specific effects using a grouping algorithm appropriate for sparse matrices.


We find that firm entry and exit are by far the largest contributors to productivity growth across
all industries. In general, firm exit contributes positively to productivity growth whereas firm
entry contributes negatively. This suggests that policies which facilitate firm entry and exit are
likely to help achieve productivity gains. Policies which provide large advantages to incumbent
firms are likely to detract from productivity growth.
2 ABSTRACT


We estimate the contribution of entry, exit, within-firm growth and re-allocation to productivity
growth in Australia in the period 2002-2013. We use an experimental linked dataset of 10 million
workers across 1.5 million firms. We produce industry-level estimates using firm-level data
across 18 industries. We estimate worker- and firm-specific effects using a grouping algorithm
appropriate for sparse matrices. Firm entry and exit are by far the largest contributors to
productivity growth across all industries. In general, firm exit contributes positively to
productivity growth whereas firm entry generally contributes negatively.


Disclaimer: the results of these studies are based, in part, on tax data supplied by the Australian Taxation 
Office (ATO) to the ABS under the Taxation Administration Act 1953, which requires that such data is only 
used for the purpose of administering the Census and Statistics Act 1905. Legislative requirements to ensure 
privacy and secrecy of this data have been adhered to. In accordance with the Census and Statistics Act 1905, 
results have been confidentialised to ensure that they are not likely to enable identification of a particular person 
or organisation. This study uses a strict access control protocol and only a current ABS officer has access to the 
underlying microdata. 

Any findings from this paper are not official statistics and the opinions and conclusions expressed in this paper
are those of the authors. The ABS takes no responsibility for any omissions or errors in the information contained
here. Views expressed in this paper are those of the authors and do not necessarily represent those of the ABS.
Where quoted or used, they should be attributed clearly to the authors.
3 INTRODUCTION


Firms are living beings; they are born, some grow to maturity, and all eventually die. 
Child mortality is high, the few who survive grow rapidly, but only a handful enjoy old 
age. New entrants are smaller and less productive on average but more diverse than 
continuing firms. Although size and productivity diversity diminish over time due to the 
selection process responsible for early death, differences in factor productivity of those that 
continue are still large and persistent. 

Lentz and Mortensen (2010, p.2) 


As Lentz and Mortensen (2010) point out, an efficient market allocates resources from less
productive firms to more productive ones. Firm dynamics (that is, how contributions from
established, entering and exiting firms affect aggregate productivity) are among the key
microdrivers of aggregate productivity (Foster et al., 2001).


The seminal surveys by Bartelsman and Doms (2000) and Syverson (2011) discuss the advantages
of using microdata to better understand the determinants of aggregate productivity.
Aggregate statistics, which give a good overview of trends in productivity growth, do not show
the variability that occurs at micro levels. It is important to develop a good understanding of
the degree to which different aspects of productivity growth within and across firms contribute
to different productivity growth across industries.


This study describes the patterns of productivity growth across eighteen industries. We examine
the components of this productivity growth by looking at firm entry and exit, reallocation across
continuing firms and productivity growth within firms. We also examine whether these patterns
differ across industries.


Our industry-level results are decomposed into contributions from surviving, entering and
exiting firms. We apply linear models, estimated separately by industry, using a Cobb-Douglas
production function as the basis to estimate firm-level productivity. Previous studies have shown
the importance of correcting for endogeneity in estimating productivity due to strong correlation
between inputs and outputs in the production process. We adapt the approaches of Abowd et al.
(2002) and Mare et al. (2017) to estimate labour inputs, which we use to address endogeneity.


This paper is structured as follows: Section 4 provides the literature review, Section 5 describes
the scope of the data and Section 6 presents the statistical models. Section 7 discusses estimation
methods and how we ensure unique identification of the estimated indicator variables. Section 8
contains empirical results. The final section gives some conclusions and directions for further
research.
4 LITERATURE REVIEW


Developing a good understanding of the determinants of aggregate productivity is challenging 
because the economy is complex. One factor in aggregate productivity growth is the reallocation
of resources from less productive firms to more productive ones. Part of this effect is captured
by firm entry and exit. Several studies describe the role of the reallocation of resources between 
firms. Influential work by Olley and Pakes (1996) and Bartelsman and Dhrymes (1998) developed 
the principal methods that most economists use to measure the impact of firm dynamics on 
aggregate productivity. These methods are often used in analyses to better understand the 
process of creative destruction that can occur within and between sectors of the economy (Foster 
et al., 2001). 


Lafrance and Baldwin (2011) explored the contribution of firm turnover to productivity growth
in the Canadian services industries. They found that the market naturally allocates resources
from uncompetitive firms to new entrants. Nguyen and Hansell (2014) explored the effects of firm
dynamics on productivity growth for Australian manufacturing and business services industries.
They found that entering and exiting firms make smaller contributions to overall
productivity than established firms.


Economists also attribute part of the measured differences in productivity to how well the inputs
used in the production process are measured. Labour economists have observed strong correlations
between the differences in firm productivity and wage costs per worker (Lentz and Mortensen, 2010).
However, this strong correlation can potentially cause endogeneity (Fox and Smeets, 2011). Better
labour quality measures for production are important to minimise endogeneity in productivity
analysis (Foster et al., 2001).


This study explores the effects of firm dynamics on aggregate productivity by adapting the approach
of Mare et al. (2017). The labour component is estimated using the approach of Abowd et al.
(2002), which takes into account two-sided worker and firm effects. This estimated labour
component is then used in a firm production function equation. The contributions to the aggregate
industry productivity are derived using the approaches of Griliches and Regev (1995) and Melitz
and Polanec (2015) to take into account firm dynamics.
5 DATA DESCRIPTION


The Australian Taxation Office (ATO), Australian Business Register (ABR) and ABS datasets are
held in both the Business Longitudinal Analytical Data Environment (BLADE) for firms (ABS
and DIIS, 2017) and the prototype Graphically Linked Information Discovery Environment for
workers (Chien and Mayer, 2015a). This section describes the ABS confidentiality protocol and
the data processing carried out for this study. The sample period is from 2002-03 to 2012-13.


5.1 Data confidentiality 


The ATO data are provided to the Australian Statistician under the Taxation Administration
Act 1953 and the ABR data are supplied to the Australian Statistician under A New Tax System
(Australian Business Number) Act 1999. These Acts require that these data are only used by the
ABS for administering the Census and Statistics Act 1905. The ABS is obliged to maintain the
confidentiality of individuals and businesses in these ATO and ABR datasets, as well as comply
with provisions that govern the use and release of this information, including the Privacy Act
1988 (ABS, 2015).


This study uses a strict access control protocol. Access to the datasets includes audit trails
and is limited on a need-to-know basis. All ABS officers are legally bound to secrecy under
the Census and Statistics Act 1905. Officers sign an undertaking of fidelity and secrecy to
ensure that they are aware of their responsibilities. The ABS policies and guidelines govern the
disclosure of information to maintain the confidentiality of individuals and organisations. This
study presents only aggregate results to ensure that they are not likely to enable identification
of a worker or a firm.
5.2 Data processing 


Our experimental worker panel uses data from the ABS prototype Graphically Linked Information
Discovery Environment (GLIDE) (Chien and Mayer, 2015a). The worker panel has 130,281,096
observations containing 1,903,015 Australian Business Numbers (ABNs) for firms and 13,131,074
de-identified and encoded Tax File Numbers (DETFNs) for workers. We only include workers whose
age is in the interval (16, 65] in the years between 2001-02 and 2012-13. Worker characteristics
such as age, sex and occupation come from Personal Income Tax (PIT) filings and wage information
comes from Pay-as-You-Go (PAYG) summaries. PAYG contains a longer time series than PIT, so this
study backcasts the PIT data to the same length. The earliest available PIT information is used
to backcast sex (holding it constant) and age (by subtracting 1 year for each year backcast). Two
methods to backcast the skill categories for workers were explored: either using the average or
holding it constant for each worker. This study found that it was not appropriate to use average
skill because workers tend to become more skilled over time, so using average skill inflates the
worker's skill level over the backcast period. The ABS's Australian and New Zealand Standard
Classification of Occupations is used to convert occupations into a 5-point skills categorical
variable for the analysis (ABS, 2009). We stress that the prototype worker panel data is
constructed for research purposes only.
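The backcasting rule described above can be sketched as follows. This is a minimal illustration only: the function name, the record layout and the use of the starting calendar year to label a financial year are assumptions made for the example, and skill is held constant because that is the option the text implies was preferred.

```python
def backcast_worker(earliest, first_year):
    """Backcast one worker's characteristics to years before the earliest PIT record.

    `earliest` is a hypothetical record such as
    {"year": 2004, "age": 30, "sex": "F", "skill": 3}.
    Sex and skill are held constant; age falls by one for each year backcast.
    """
    rows = []
    for year in range(first_year, earliest["year"]):
        years_back = earliest["year"] - year
        rows.append({
            "year": year,
            "age": earliest["age"] - years_back,
            "sex": earliest["sex"],      # held constant
            "skill": earliest["skill"],  # held constant (not averaged)
        })
    return rows


backcast = backcast_worker({"year": 2004, "age": 30, "sex": "F", "skill": 3}, 2001)
for row in backcast:
    print(row)
```

A production version would also need to drop backcast rows whose age falls outside the (16, 65] scope.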


The experimental worker panel is aggregated to the firm level to derive worker-level variables
for each ABN. The employee counts come from PAYG records, which contain both ABNs and
DETFNs. We count the total number of DETFNs for each ABN in each year. One key challenge
in analysing integrated administrative datasets is the different coverage of the firms. For
example, at the firm level, there is only a 6% difference in scope between PAYG and PIT files
between 2001-02 and 2012-13. This difference in scope can be caused by the timing of the
processing or by including only working-age workers.


The information on firm industry classification comes from both the ATO and the ABS Business
Register (ABSBR). A majority of firms have a valid industry classification. The industry
classifications for these ABNs do not change over the sample period between 2001-02 and 2012-13.
We impute industry classifications for 45,961 ABNs using the method discussed in Section A. We
ensure that the imputed industry classifications also do not change for re-entered firms (e.g.,
firms that drop in and out due to processing errors or late processing). This is important to
minimise bias in the decomposition analysis at the industry level. We also use the following
heuristic rules for the data processing. First, 335 ABNs have a missing or invalid year of
incorporation and a majority of these firms are in 2001-02. We assume that these firms were
incorporated in 2000-01, the year when the ABN was first introduced. Secondly, the information on
firm entry and exit comes from the ABSBR for ABNs in 2001-02 only and from our own derivation for
ABNs between 2002-03 and 2012-13. We do not classify re-entered firms as entry firms or exit firms
during the missing spells. For example, suppose firm A has observations for 2001-02, 2003-04,
2004-05 and 2005-06. Firm A is classified as an entry firm in 2001-02, a continuing firm in
2003-04 and 2004-05 and an exit firm in 2005-06. We also use the SAS proc expand procedure to
longitudinally interpolate multifactor productivity and industry weights for these re-entering
firms.
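The treatment of missing spells can be illustrated with a short sketch. The function below is hypothetical: it labels years by their starting calendar year, treats the first observed year as entry and the last observed year as exit unless it coincides with the end of the panel, and never creates an exit or entry inside a gap. The study itself takes 2001-02 entry and exit information from the ABSBR, which this sketch does not model.

```python
def classify_firm_years(years, panel_end):
    """Label each observed year of one firm as entry, continuing or exit.

    Gaps between the first and last observed year are treated as missing
    spells, not as an exit followed by a re-entry. A firm observed in a
    single year is labelled as an entry (an assumption of this sketch).
    """
    observed = sorted(set(years))
    first, last = observed[0], observed[-1]
    labels = {}
    for year in observed:
        if year == first:
            labels[year] = "entry"
        elif year == last and last < panel_end:
            labels[year] = "exit"
        else:
            labels[year] = "continuing"
    return labels


# Firm A from the text: observed in 2001-02, 2003-04, 2004-05 and 2005-06.
firm_a = classify_firm_years([2001, 2003, 2004, 2005], panel_end=2012)
print(firm_a)
```

For firm A this reproduces the classification given in the text: entry in 2001-02, continuing in 2003-04 and 2004-05, and exit in 2005-06, with the missing 2002-03 spell left unclassified.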


Figure 11 in Appendix F shows the firm entry and exit rates over the sample period. It is
interesting to note that the exit rates were generally lower between 2002-03 and 2009-10 except
for the finance industry in 2007-08. The higher firm exit rates in the finance industry could
have been caused by the global financial crisis.
5.3 Data linking and summary 


The study uses a similar linking strategy to ABS (2015) and Chien and Mayer (2015b) to assemble
the developmental firm panel using an experimental BLADE. The firm records were
deterministically linked using ABNs and worker records were deterministically linked using
DETFNs. As the linking variable is encrypted, it is not possible to identify individuals in
the datasets. The experimental BLADE contains firm data sourced from the ATO, the ABR
and the ABS. The sample period is from 2002-03 to 2012-13. BLADE contains detailed firm
characteristics data from Business Income Tax (BIT), Business Activity Statements (BAS) and
the ABR (Hansell and Rafi, 2018). The experimental firm panel has 43,191,403 observations
with 6,846,067 ABNs in the sample period between 2001-02 and 2012-13. The firm panel contains
non-employing firms. Most firm-level variables, such as firm sales and materials costs,
come from BIT or BAS. We include all firms with valid records. This study uses an experimental
version of BLADE and therefore statistical issues discussed here may not exist in the production
version of BLADE. Our experimental BLADE contains firm characteristics data. It does not contain
any data about worker characteristics beyond the number of employees and total wages.


Tables 16 and 17 in Appendix F show the summary statistics for the firm and worker panels.
The summary statistics for the worker panel show that we focus on workers who have at least
2 years in the labour market (aged 17 years and older) and are at most 65 years old. The
proportion of workers aged between 15 and 16 years is small. The summary statistics for the
firm panel are broadly consistent across the balanced and imputed datasets. This is in line with
the correlation analysis in Tables 2 and 3 below. The summary statistics for the firm panel show
that, in the sample, the youngest firm is 1 year old and the 99th percentile is around 19 years
old. It is interesting to note that, at the 99th percentile, real WAGES costs are higher than the
logarithm of the estimated labour component in real terms. Table 1 shows firm sizes and firm
years in sample. Large firms, i.e., employee size > 200, are more likely to be in the sample for
a longer period.


Table 1: Firm size and years in sample


                                       Years in sample
Firm size      1     2     3     4     5     6     7     8     9    10    11    12   Total
1 to 4       3.9   4.9   5.7   7.1   5.2   5.3   5.1   4.7   4.8   5.0   6.0   8.7    66.3
5 to 19      0.2   0.5   0.9   1.8   1.4   1.6   1.7   1.7   1.8   2.1   3.2   7.8    24.7
20 to 199    0.0   0.1   0.1   0.5   0.4   0.4   0.5   0.5   0.5   0.7   1.0   3.6     8.3
200 plus     0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.1   0.5     0.7
Total        4.1   5.5   6.8   9.5   7.0   7.3   7.3   6.8   7.1   7.8  10.3  20.6   100.0


5.4 Missing data 


We use an unbalanced panel of firms. Figure 1 shows the missing data pattern for the ABS
experimental data used in the sample.


Figure 1: Missing data patterns in experimental datasets for all firms with > 1 employee
Note. In each subfigure, the left panel is a bar chart showing the proportion of missing data for each variable. The right
panel shows the missing data patterns and the proportion of each pattern; a green tile indicates missing data and a blue
tile indicates non-missing data. These proportions are scaled to increase the readability of the plot (Templ et al., 2012).
The variables lnz, lnK, lny, lnM and lnFirm_Age are the logarithms of labour for firms, capital for firms, sales for firms,
materials used for production and firm age, respectively. The number of employees is Employees and the industry
classification for firms is Industry.
Dropping firms where some variables are missing results in a dramatic reduction in sample size.
Therefore we assume the data are missing at random and impute the missing variables for observed
firms using a sequential regression approach in SAS, namely the proc mi procedure (see Appendix
B). We create 10 imputed datasets and repeat the analysis on each of them. From the 10
imputations we select the results which maximise the likelihood function for the firm-level
productivity model (3) below. Tables 2 and 3 compare the correlation coefficients for the
variables of interest. These tables show that the correlation coefficients are consistent between
the complete cases and the imputed datasets.


Table 2: Pearson Correlation Coefficients -- balanced dataset

Note. The p-value for each coefficient is shown on the line beneath it.

Variable    lny       lnK       lnM       lnFirm_Age  lnz       WAGES
lny         1         0.5206    0.58389   0.13821     0.01096   0.72011
                      <.0001    <.0001    <.0001      <.0001    <.0001
lnK         0.5206    1         0.4829    0.10082     -0.12324  0.50686
            <.0001              <.0001    <.0001      <.0001    <.0001
lnM         0.58389   0.4829    1         0.12111     0.02614   0.62887
            <.0001    <.0001              <.0001      <.0001    <.0001
lnFirm_Age  0.13821   0.10082   0.12111   1           -0.42234  0.15778
            <.0001    <.0001    <.0001                <.0001    <.0001
lnz         0.01096   -0.12324  0.02614   -0.42234    1         -0.0151
            <.0001    <.0001    <.0001    <.0001                <.0001
WAGES       0.72011   0.50686   0.62887   0.15778     -0.01508  1
            <.0001    <.0001    <.0001    <.0001      <.0001


Table 3: Pearson Correlation Coefficients -- imputed dataset

Note. The p-value for each coefficient is shown on the line beneath it.

Variable    lny       lnK       lnM       lnFirm_Age  lnz       WAGES
lny         1         0.53644   0.55506   0.13707     0.0238    0.7391
                      <.0001    <.0001    <.0001      <.0001    <.0001
lnK         0.53644   1         0.4823    0.10351     -0.11546  0.5301
            <.0001              <.0001    <.0001      <.0001    <.0001
lnM         0.55506   0.4823    1         0.07288     0.0215    0.5616
            <.0001    <.0001              <.0001      <.0001    <.0001
lnFirm_Age  0.13707   0.10351   0.07288   1           -0.42192  0.137
            <.0001    <.0001    <.0001                <.0001    <.0001
lnz         0.0238    -0.11546  0.0215    -0.42192    1         0.0176
            <.0001    <.0001    <.0001    <.0001                <.0001
WAGES       0.73906   0.53006   0.56157   0.137       0.01758   1
            <.0001    <.0001    <.0001    <.0001      <.0001


Results from our imputation approach match ABS results more closely than those where we
drop all firms with any missing values. The analysis of the complete case data, which involves
dropping 80 per cent of the data, produces a lot of volatility and is inconsistent with ABS
results. We therefore prefer the imputation approach.
6 STATISTICAL MODELS


6.1 Worker equation 


This analysis uses a modified wage equation adapted from Abowd et al. (2002). The worker
panel is unbalanced, meaning that the available observations for each worker $i$, $i = 1, \dots, N$,
can be different. Suppose that the observations for worker $i$ are available at time
$t = 1, \dots, T_i$. So $t = 1$ is the first time period and $t = T_i$ is the last time period for
the available observations for worker $i$. Note that there can be gaps. A worker might appear in
periods 1 and 3 but not in period 2, for example. We model $y_{it}$, the wages for worker $i$ at
time $t$, as

$$\ln(y_{it}) = x_{it}'\alpha + \theta_i + f_{it}'\psi + \varepsilon_{it}, \tag{1}$$

where $x_{it}$ is a $p$-vector of characteristics of worker $i$ at time $t$, $\alpha$ is a
$p$-vector of unknown coefficients of the worker characteristics, $\theta_i$ represents unobserved
(time-invariant) worker effects, the components of the $J$-vector $\psi = (\psi_1, \dots, \psi_J)'$
represent firm effects (e.g. specific factors such as pay structure that affect workers' wages),
$f_{it} = (f_{i1t}, \dots, f_{iJt})'$ is a firm indicator vector with components

$$f_{ijt} = \begin{cases} 1, & \text{if worker } i \text{ works for firm } j \text{ at time } t \\ 0, & \text{otherwise,} \end{cases}$$

and the random disturbances $\varepsilon_{it}$ are assumed to satisfy $\varepsilon_{it} \sim N(0, \sigma^2)$.


It is convenient to write the term $x_{it}'\alpha$, which describes worker characteristics, in
Wilkinson and Rogers (1973) notation as Sex + HighSkill + MediumSkill + WorkingSkill + Time +
Poly(Age, 4) + Sex : Poly(Age, 4) + Sex : Time. Here the indicator Sex = 1 if worker $i$ is male
and 0 otherwise. The indicator HighSkill = 1 if worker $i$ has a tertiary qualification and 0
otherwise. The indicator MediumSkill = 1 if worker $i$ has at most a diploma qualification and
0 otherwise. The indicator WorkingSkill = 1 if worker $i$ has at most a certificate III
qualification and 0 otherwise. Workers with qualifications lower than a certificate III
qualification are treated as the baseline and included in the intercept. The variable Time is
represented by 11 time indicator variables, one for each year, with 2001-02 as the baseline. The
variable Age, the age of worker $i$ at time $t$, is fitted by a quartic polynomial including
linear, quadratic, cubic and quartic terms. We include the quartic term to better describe the
data because fitting only up to quadratic or cubic terms does not describe the decline in
workers' wages as they get older. We include the interaction terms Sex : Poly(Age, 4) between Sex
and Age and Sex : Time between Sex and Time. This makes each $x_{it}'\alpha$ a sum of $p = 34$
terms.
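The count of 34 terms can be checked by adding up the number of columns that each term of the formula contributes. The short tally below is only a bookkeeping aid; the dictionary is not part of the original analysis.

```python
# Columns contributed by each term of the worker-characteristics formula.
columns_per_term = {
    "Sex": 1,
    "HighSkill": 1,
    "MediumSkill": 1,
    "WorkingSkill": 1,
    "Time": 11,              # one indicator per year, 2001-02 is the baseline
    "Poly(Age, 4)": 4,       # linear, quadratic, cubic and quartic terms
    "Sex:Poly(Age, 4)": 4,   # Sex interacted with each polynomial term
    "Sex:Time": 11,          # Sex interacted with each time indicator
}

p = sum(columns_per_term.values())
print(p)  # 34
```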


Following Mare et al. (2017), we estimate (1), pooling across all workers at all time periods in
all industries. We then derive an instrument for firm-specific labour inputs, which we use in
(3) below, based upon the average fitted values for each firm $j$. Specifically, let
$\hat{\alpha}$, $\hat{\theta}_i$ and $\hat{\psi}$ denote estimates of parameters in (1). We use a
two-stage least squares approach to derive our instrumental variable (Wooldridge, 2006). The
estimated person and firm effects from (1) are correlated with firm productivity because firms
with higher quality workers or better management practices are likely to be more productive. The
equation to derive the proposed instrumental variable is:

$$y_t^{(j)} = x_t^{(j)\prime}\alpha + \varepsilon_t^{(j)}, \tag{2}$$

where $y_t^{(j)} = \sum_{i=1}^{N} f_{ijt}\, y_{it}$ and $x_t^{(j)} = \sum_{i=1}^{N} f_{ijt}\, x_{it}$.

Note that the variables in (2) now have a firm superscript, $j$, to reflect the averaging of
worker effects within each firm $j$. We use $\hat{z}_t^{(j)}$ to denote the instrumental variable
derived as the predicted value from (2). When we want to emphasise below that firm $j$ belongs to
industry $k$, we also include the industry superscript $k$ so that the estimated firm-average
worker effect (the average effect of a worker in each firm) $\hat{z}_t^{(j)}$ from (2) becomes
$\hat{z}_t^{(jk)}$.
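The firm-level aggregation in (2) amounts to collecting worker records into firm-by-year cells. The sketch below is a minimal illustration with made-up records; the tuple layout and function name are assumptions. It returns both the cell total and the head count, so that a within-firm average can be formed if that is what is required.

```python
from collections import defaultdict


def aggregate_to_firm(records):
    """Sum worker-level values within each (firm, year) cell.

    `records` is an iterable of (worker_id, firm_id, year, value) tuples.
    The presence of a record plays the role of the indicator f_ijt = 1.
    Returns {(firm_id, year): (total, head_count)}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _worker, firm, year, value in records:
        totals[(firm, year)] += value
        counts[(firm, year)] += 1
    return {cell: (totals[cell], counts[cell]) for cell in totals}


# Illustrative records only: worker w1 moves from firm A to firm B.
records = [
    ("w1", "A", 2003, 10.0),
    ("w2", "A", 2003, 11.0),
    ("w3", "B", 2003, 9.5),
    ("w1", "B", 2004, 10.5),
]
firm_cells = aggregate_to_firm(records)
total, heads = firm_cells[("A", 2003)]
print(total, heads, total / heads)
```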


6.2 Firm level productivity model 


The firm volume outputs can be modelled as functions of the observed inputs such as capital,
materials and labour in volume terms, and unobserved components in the production process
(Fox and Smeets, 2011). We use a Cobb-Douglas production function, similar to Breunig and
Wong (2008) and Mare et al. (2017), to model $y^{firm}_{jkt}$, the output (i.e., sales adjusted for
repurchase of stock) of firm $j$ in industry $k$ at time $t$, deflated by industry gross value
added implicit price deflators (ABS, 2018a), as:

$$\ln y^{firm}_{jkt} = \beta_{0k} + \beta_{1k}\ln L_{jkt} + \beta_{2k}\ln K_{jkt} + \beta_{3k}\ln M_{jkt} + \beta_{4k}\ln Firm\_Age_{jkt} + \tau_{kt} + \varepsilon_{jkt}, \tag{3}$$

where $\ln L_{jkt}$ is the logarithm of labour inputs deflated by the Wage Price Index and
$\ln K_{jkt}$ is the logarithm of the cost of capital, which includes depreciation, capital rental
expenses and capital work deductions, deflated by the industry consumption of fixed capital
implicit price deflators (ABS, 2018a). The logarithm of material costs $\ln M_{jkt}$ is the inputs
used in the production, deflated by Producer Price Indexes: Intermediate Goods (ABS, 2018b; also
see the 'List of Symbols and Variables' for information). The logarithm of firm age is
$\ln Firm\_Age_{jkt}$. We also include different intercepts $\beta_{0k}$ for each industry and
time-fixed effects $\tau_{kt}$. The multifactor productivity term $\varepsilon_{jkt}$ is assumed to
satisfy $\varepsilon_{jkt} \sim N(0, \sigma^2)$ to estimate unbiased coefficients for the
Cobb-Douglas production function (Zellner et al., 1966).


Endogeneity causes bias in estimating the production function (3). To mitigate the bias, many
studies use predicted values from instrumental variables equations, typically using lagged inputs
as instruments for the current inputs (Gandhi et al., 2011). For example, Olley and Pakes (1996)
and Breunig and Wong (2008) use lagged capital investment, Levinsohn and Petrin (2003) and
Bakhtiari (2015) use lagged material inputs, and Fox and Smeets (2011) use lagged wage costs
as instrumental variables. However, Reed (2015) cautions against the use of lagged instrumental
variables to correct for the simultaneity bias. Our labour component comes from the estimated
instrument, $\hat{z}_t^{(jk)}$, from (2). We remove components in (1) that can correlate with firm
multifactor productivity $\varepsilon_{jkt}$ in (3).


We fit separate models for each industry and include $k$ to emphasise the nesting of firm $j$ in
industry $k$. This specification imposes the same production technology within an industry, lets
it vary across industries, and allows each firm to have an individual productivity component. The
model fitted to the data is:

$$\ln y^{firm}_{jkt} = \beta_{0k} + \beta_{1k}\ln \hat{z}_t^{(jk)} + \beta_{2k}\ln K_{jkt} + \beta_{3k}\ln M_{jkt} + \beta_{4k}\ln Firm\_Age_{jkt} + \tau_{kt} + \varepsilon_{jkt}. \tag{4}$$

The estimated parameters for (4) include the industry intercepts $\hat{\beta}_{0k}$, labour inputs
$\hat{\beta}_{1k}$, cost of capital $\hat{\beta}_{2k}$, materials costs $\hat{\beta}_{3k}$, firm age
$\hat{\beta}_{4k}$, time-fixed effects $\hat{\tau}_{kt}$ and multifactor productivity:

$$\hat{\varepsilon}_{jkt} = \ln y^{firm}_{jkt} - \left(\hat{\beta}_{0k} + \hat{\beta}_{1k}\ln \hat{z}_t^{(jk)} + \hat{\beta}_{2k}\ln K_{jkt} + \hat{\beta}_{3k}\ln M_{jkt} + \hat{\beta}_{4k}\ln Firm\_Age_{jkt} + \hat{\tau}_{kt}\right).$$

Firm productivity is the ratio of output to measured inputs normalised relative to the industry
$k$ mean. The firm productivity is defined as:

$$Mfp_{jkt} = \hat{\tau}_{kt} + \hat{\varepsilon}_{jkt}. \tag{5}$$


We specify the pooled production function regression to calculate industry weights and aggregate
the contributions of firms to industry multi-factor productivity growth as

$$\ln y^{firm}_{jkt} = \beta_1 \ln \hat{z}_t^{(jk)} + \beta_2 \ln K_{jkt} + \beta_3 \ln M_{jkt} + \beta_4 \ln Firm\_Age_{jkt} + \lambda_k + \tau_t + \varepsilon^{(pooled)}_{jkt}, \tag{6}$$

where the industry and year effects ($\lambda_k$ and $\tau_t$) are estimated as fixed effects. We
use (6) to define industry weights $\omega_{jkt}$ as:

$$\omega_{jkt} = \hat{\beta}_1 \ln \hat{z}_t^{(jk)} + \hat{\beta}_2 \ln K_{jkt} + \hat{\beta}_3 \ln M_{jkt} + \hat{\beta}_4 \ln Firm\_Age_{jkt}. \tag{7}$$


6.3 Industry productivity 


We follow Mare et al. (2017) and define the aggregate productivity index $A_{kt}$ for an industry
$k$ at time $t$ as:

$$A_{kt} = \sum_{j=1}^{J_{kt}} \omega_{jkt}\, Mfp_{jkt}, \tag{8}$$

and $J_{kt}$ is the number of firms in industry $k$ at time $t$. The multi-factor productivity term
$Mfp_{jkt}$ and industry weights $\omega_{jkt}$ are defined in the previous section. Note that the
weights $\omega_{jkt}$ satisfy $\sum_{j=1}^{J_{kt}} \omega_{jkt} = 1$ for each industry $k$ and
time $t$.

Next, aggregating to industry level, let $\omega_{kt} = \sum_{j=1}^{J_{kt}} \omega_{jkt}$ and
$Mfp_{kt} = \sum_{j=1}^{J_{kt}} Mfp_{jkt}$. Then the aggregate productivity index $A_t$ for all
industries at time $t$ is:

$$A_t = \sum_{k=1}^{K_t} \tilde{\omega}_{kt}\, Mfp_{kt}, \tag{9}$$

where $\tilde{\omega}_{kt} = \omega_{kt} / \sum_{k=1}^{K_t} \omega_{kt}$ and $K_t$ is the number of
industries at time $t$. Note the weights $\tilde{\omega}_{kt}$ satisfy
$\sum_{k=1}^{K_t} \tilde{\omega}_{kt} = 1$ for each time $t$.


Griliches and Regev (1995) propose decomposing the changes in aggregate productivity from
time $t-1$ to $t$ into contributions from surviving ($S$), entering ($EN$) and exiting ($EX$)
firms as:

$$\Delta A_{kt} = W_{kt} + B_{kt} + EN_{kt} + EX_{kt}, \tag{10}$$

where

$$W_{kt} = \sum_{j \in S_{kt}} \bar{\omega}_{jkt}\, \Delta Mfp_{jkt},$$

$$B_{kt} = \sum_{j \in S_{kt}} \Delta\omega_{jkt}\left(\overline{Mfp}_{jkt} - \bar{A}_{kt}\right),$$

$$EN_{kt} = \sum_{j \in EN_{kt}} \omega_{jkt}\left(Mfp_{jkt} - \bar{A}_{kt}\right) \quad \text{and}$$

$$EX_{kt} = -\sum_{j \in EX_{kt}} \omega_{jk,t-1}\left(Mfp_{jk,t-1} - \bar{A}_{kt}\right).$$

The symbol $\Delta$ represents changes, so $\Delta A_{kt} = A_{kt} - A_{k,t-1}$ is the change in
aggregate productivity for industry $k$ from time $t-1$ to time $t$. Bars represent averages
between $t$ and $t-1$, so $\bar{\omega}_{jkt} = (\omega_{jkt} + \omega_{jk,t-1})/2$ and
$\bar{A}_{kt} = (A_{kt} + A_{k,t-1})/2$. The exit term carries a negative sign so that the four
components sum to $\Delta A_{kt}$. The definitions of surviving $j \in S_{kt}$, entering
$j \in EN_{kt}$ and exiting firms $j \in EX_{kt}$ are based on firm transitions on an annual basis
over the observed sample period. Survivors are firms operating in $t$ and $t-1$, exiting firms are
firms that exist at time $t-1$ but not at time $t$ and entering firms are firms that did not exist
at time $t-1$ but did at time $t$. The contribution of the surviving firms is decomposed into two
components: the within-industry reallocation $W_{kt}$, which measures the change in firm
productivity weighted by the average of the weights at $t$ and $t-1$ (i.e., $\bar{\omega}_{jkt}$),
and the between-industry reallocation $B_{kt}$, which measures deviations from the average
productivity (i.e., $\bar{A}_{kt}$) including the impact of firm entry and exit (Foster et al.,
2001).
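A small numerical check helps to see how the four components of (10) add up. The sketch below uses made-up weights and productivity levels for four hypothetical firms in one industry (two survivors, one exiter, one entrant) and assumes that the weights sum to one in each period; the exit term is computed with a negative sign so that the components sum to the change in the aggregate index.

```python
def griliches_regev(prev, curr):
    """Decompose the change in aggregate productivity for one industry.

    `prev` and `curr` map firm id -> (weight, mfp) at t-1 and t.
    Weights are assumed to sum to one within each period.
    """
    def aggregate(cells):
        return sum(w * m for w, m in cells.values())

    a_bar = 0.5 * (aggregate(prev) + aggregate(curr))
    survivors = prev.keys() & curr.keys()
    entrants = curr.keys() - prev.keys()
    exiters = prev.keys() - curr.keys()

    within = sum(0.5 * (curr[j][0] + prev[j][0]) * (curr[j][1] - prev[j][1])
                 for j in survivors)
    between = sum((curr[j][0] - prev[j][0])
                  * (0.5 * (curr[j][1] + prev[j][1]) - a_bar)
                  for j in survivors)
    entry = sum(curr[j][0] * (curr[j][1] - a_bar) for j in entrants)
    exit_ = -sum(prev[j][0] * (prev[j][1] - a_bar) for j in exiters)
    return {"within": within, "between": between, "entry": entry, "exit": exit_}


# Illustrative numbers only.
prev = {"a": (0.5, 1.0), "b": (0.3, 0.8), "x": (0.2, 0.4)}   # x exits
curr = {"a": (0.5, 1.1), "b": (0.3, 0.9), "n": (0.2, 0.7)}   # n enters
gr_parts = griliches_regev(prev, curr)
delta_a = (sum(w * m for w, m in curr.values())
           - sum(w * m for w, m in prev.values()))
print(gr_parts, delta_a)
```

In this example the low-productivity exiter contributes positively and the below-average entrant contributes negatively, and the four components sum to the change in the aggregate index.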


Fox and Smeets (2011) discuss the importance of using appropriate benchmarks to calculate
the contributions of surviving, entering and exiting firms to aggregate productivity. This study
uses an alternative method proposed by Melitz and Polanec (2015). Their dynamic Olley-Pakes
decomposition incorporates a decomposition proposed by Olley and Pakes (1996), which
captures the covariance of productivity changes and market share of an individual firm over
time. Let $J$ denote $J_{kt}$ (i.e., the number of firms in industry $k$ at time $t$) and the
equation is:

$$A_{kt} = \overline{Mfp}_{kt} + \sum_{j=1}^{J} \left(\omega_{jkt} - \bar{\omega}_{kt}\right)\left(Mfp_{jkt} - \overline{Mfp}_{kt}\right) = \overline{Mfp}_{kt} + Cov\left(\omega_{jkt}, Mfp_{jkt}\right), \tag{11}$$

where

$$\overline{Mfp}_{kt} = \frac{\sum_{j=1}^{J} I_{jkt}\, Mfp_{jkt}}{\sum_{j=1}^{J} I_{jkt}} \quad \text{and} \quad \bar{\omega}_{kt} = \frac{\sum_{j=1}^{J} I_{jkt}\, \omega_{jkt}}{\sum_{j=1}^{J} I_{jkt}}, \quad \text{with}$$

$$I_{jkt} = \begin{cases} 1, & \text{if firm } j \text{ operates in industry } k \text{ at time } t \\ 0, & \text{otherwise.} \end{cases}$$
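The identity in (11) can be verified numerically. The sketch below uses three hypothetical firms with made-up weights that sum to one; note that the covariance term here is the sum of cross-products as written in (11), not a sample covariance divided by the number of firms.

```python
def olley_pakes(weights, mfp):
    """Split a weighted productivity index into an unweighted mean and a
    covariance-type term, for weights that sum to one."""
    n = len(weights)
    w_bar = sum(weights) / n
    m_bar = sum(mfp) / n
    cov = sum((w - w_bar) * (m - m_bar) for w, m in zip(weights, mfp))
    return m_bar, cov


# Illustrative numbers only.
weights = [0.5, 0.3, 0.2]
mfp = [1.1, 0.9, 0.7]
mean_term, cov_term = olley_pakes(weights, mfp)
weighted_index = sum(w * m for w, m in zip(weights, mfp))
print(mean_term, cov_term, weighted_index)
```

The covariance term is positive here because the firms with larger weights are also the more productive ones.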


The dynamic Olley-Pakes approach decomposes aggregate productivity into contributions from
surviving, entering and exiting firms as:

$$\Delta A_{kt} = W_{kt} + B_{kt} + EN_{kt} + EX_{kt}, \tag{12}$$

where, for surviving firms, $W_{kt} = \Delta \overline{Mfp}_{kt}$ and $B_{kt} = \Delta Cov_{kt}$,

$$EN_{kt} = \sum_{j \in EN_{kt}} \omega_{jkt}\left(A_{jkt \in EN} - A_{jkt \in S}\right) \quad \text{and}$$

$$EX_{kt} = \sum_{j \in EX_{kt}} \omega_{jk,t-1}\left(A_{jk,t-1 \in S} - A_{jk,t-1 \in EX}\right).$$

The dynamic Olley-Pakes decomposition approach uses more appropriate benchmarks for the
entering and exiting firms (see the discussion in Section 8). For example, entering firms only
generate positive growth when they have higher productivity than surviving firms at time $t$.
Similarly, exiting firms can only generate a positive contribution if they have lower productivity
than surviving firms at time $t-1$.
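The role of the benchmarks can be seen in a small worked example. The sketch below follows the dynamic Olley-Pakes decomposition in the form usually attributed to Melitz and Polanec (2015), in which each group's productivity is its share-weighted average with shares renormalised within the group; this form, the function names and the numbers are assumptions made for illustration and may differ in detail from the implementation used in this study. At least one surviving firm is assumed.

```python
def group_stats(cells, ids):
    """Return (share, weighted productivity, unweighted mean, covariance term)
    for the firms in `ids`, with shares renormalised within the group."""
    share = sum(cells[j][0] for j in ids)
    n = len(ids)
    mean = sum(cells[j][1] for j in ids) / n
    cov = sum((cells[j][0] / share - 1.0 / n) * (cells[j][1] - mean) for j in ids)
    return share, mean + cov, mean, cov


def dynamic_olley_pakes(prev, curr):
    """`prev` and `curr` map firm id -> (weight, mfp) at t-1 and t,
    with weights summing to one in each period."""
    survivors = sorted(prev.keys() & curr.keys())
    entrants = sorted(curr.keys() - prev.keys())
    exiters = sorted(prev.keys() - curr.keys())
    _, phi_s1, mean_s1, cov_s1 = group_stats(prev, survivors)
    _, phi_s2, mean_s2, cov_s2 = group_stats(curr, survivors)
    entry = exit_ = 0.0
    if entrants:
        share_e, phi_e, _, _ = group_stats(curr, entrants)
        entry = share_e * (phi_e - phi_s2)   # benchmark: survivors at t
    if exiters:
        share_x, phi_x, _, _ = group_stats(prev, exiters)
        exit_ = share_x * (phi_s1 - phi_x)   # benchmark: survivors at t-1
    return {"within": mean_s2 - mean_s1, "between": cov_s2 - cov_s1,
            "entry": entry, "exit": exit_}


# Illustrative numbers only: firm x exits, firm n enters.
prev_period = {"a": (0.5, 1.0), "b": (0.3, 0.8), "x": (0.2, 0.4)}
curr_period = {"a": (0.5, 1.1), "b": (0.3, 0.9), "n": (0.2, 0.7)}
dopd_parts = dynamic_olley_pakes(prev_period, curr_period)
print(dopd_parts)
```

With these numbers the entrant is less productive than the survivors at time t, so entry contributes negatively, while the exiter is less productive than the survivors at time t - 1, so exit contributes positively.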
7 ESTIMATION METHODS 


7.1 Data structure 


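The model in (1) can be written as a model for each worker $i$ by stacking the observations over
time. We obtain

$$
\underset{T_i \times 1}{y_i} = \underset{T_i \times p}{X_i}\,\alpha + \underset{T_i \times 1}{1_i}\,\theta_i
  + \underset{T_i \times J}{F_i}\,\psi + \underset{T_i \times 1}{\varepsilon_i}, \tag{13}
$$

where

$$
y_i = \begin{pmatrix} y_{it_1} \\ \vdots \\ y_{it_{T_i}} \end{pmatrix}, \quad
X_i = \begin{pmatrix} x_{it_1}^{\top} \\ \vdots \\ x_{it_{T_i}}^{\top} \end{pmatrix}, \quad
1_i = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}, \quad
F_i = \begin{pmatrix} f_{it_1}^{\top} \\ \vdots \\ f_{it_{T_i}}^{\top} \end{pmatrix}, \quad
\varepsilon_i = \begin{pmatrix} \varepsilon_{it_1} \\ \vdots \\ \varepsilon_{it_{T_i}} \end{pmatrix}.
$$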


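The model for the whole sample can be written in matrix form as

$$
\underset{N^* \times 1}{y} = \underset{N^* \times p}{X}\,\alpha + \underset{N^* \times N}{P}\,\underset{N \times 1}{\theta}
  + \underset{N^* \times J}{F}\,\psi + \underset{N^* \times 1}{\varepsilon}, \tag{14}
$$

where

$$
y = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix}, \quad
X = \begin{pmatrix} X_1 \\ \vdots \\ X_N \end{pmatrix}, \quad
P = \begin{pmatrix} 1_1 & & 0 \\ & \ddots & \\ 0 & & 1_N \end{pmatrix}, \quad
\theta = \begin{pmatrix} \theta_1 \\ \vdots \\ \theta_N \end{pmatrix}, \quad
F = \begin{pmatrix} F_1 \\ \vdots \\ F_N \end{pmatrix}, \quad
\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_N \end{pmatrix}
$$

and $N^* = \sum_{i=1}^{N} T_i$ is the total number of observations.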
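To make the stacking concrete, the sketch below builds the worker indicator matrix $P$ and the firm indicator matrix $F$ in coordinate (row, column) form from a handful of job records. It is illustrative only: the record layout and identifiers are assumptions, and a production system would use a dedicated sparse-matrix library.

```python
# Hypothetical (worker id, firm id, year) job records.
records = [
    ("w1", "f1", 2002), ("w1", "f1", 2003), ("w1", "f2", 2004),
    ("w2", "f2", 2003), ("w2", "f2", 2004),
]
records.sort(key=lambda r: (r[0], r[2]))   # stack by worker, then by time

workers = sorted({w for w, _, _ in records})
firms = sorted({f for _, f, _ in records})
w_col = {w: c for c, w in enumerate(workers)}
f_col = {f: c for c, f in enumerate(firms)}

# Each observation (row) has exactly one 1 in P (its worker) and one 1 in F (its firm),
# so only the positions of the ones need to be stored.
P = [(row, w_col[w]) for row, (w, _, _) in enumerate(records)]
F = [(row, f_col[f]) for row, (_, f, _) in enumerate(records)]

n_star = len(records)                                            # N* = sum of T_i
T = {w: sum(1 for r in records if r[0] == w) for w in workers}   # T_i per worker
```

Because every row holds a single non-zero entry in each of $P$ and $F$, storage grows with the number of observations rather than with $N^* \times (N + J)$.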


7.2 Preconditioned conjugate gradient algorithm 


Abowd et al. (1999) highlighted the challenges of fitting model (1) due to the large number
of workers and firms. The US study contains N > 1 million workers and J > 50,000 firms.
The Australian prototype dataset contains N > 10 million workers and J > 1.5 million firms
for around 130 million observations over eleven years for the worker equation (1). This study
uses the direct estimation methodology proposed by Abowd et al. (2002), which involves first
solving a large sparse linear system with a preconditioned conjugate gradient algorithm, and
then imposing constraints on the parameters to identify unique worker and firm effects. The
conjugate gradient algorithm solves the sparse linear system $A\beta = c$, where $A$ is a symmetric
positive definite matrix, $\beta$ is an unknown vector and $c$ is a known vector. For ordinary least
squares estimation of the parameters in (1), the system is defined with


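$$
A = \begin{pmatrix} X'X & X'P & X'F \\ P'X & P'P & P'F \\ F'X & F'P & F'F \end{pmatrix}, \quad
\beta = \begin{pmatrix} \alpha \\ \theta \\ \psi \end{pmatrix} \quad \text{and} \quad
c = \begin{pmatrix} X'y \\ P'y \\ F'y \end{pmatrix}. \tag{15}
$$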


Since $A$ is a large, sparse matrix, iterative methods like the conjugate gradient algorithm perform
better if we transform $A$ to improve its condition number (Shewchuk, 1994). There are many
options for creating a preconditioning matrix, including incomplete Cholesky factorisation or
diagonal preconditioning, which uses a diagonal matrix whose diagonal entries are identical to
the diagonal elements of $A$ (see Song (2013) for a review). The preconditioning matrix used in
the algorithm is a variant of incomplete Cholesky factorisation. Let


$$
U = \begin{pmatrix} Z & 0 & 0 \\ 0 & P^{1/2} & 0 \\ 0 & 0 & F^{1/2} \end{pmatrix},
$$


where $Z$ is the upper triangular matrix obtained from the Cholesky decomposition of $X'X$,
$P^{1/2}$ is the diagonal matrix with the square roots of the diagonal terms of $P'P$ on the diagonal
and $F^{1/2}$ is the diagonal matrix with the square roots of the diagonal terms of $F'F$ on the
diagonal. Following Fasshauer (2007), rewrite the system as


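$$
\tilde{A}\tilde{\beta} = \tilde{c},
$$

where

$$
\tilde{A} = (U')^{-1} A\, U^{-1} =
\begin{pmatrix}
I & (Z')^{-1} X'P\, P^{-1/2} & (Z')^{-1} X'F\, F^{-1/2} \\
P^{-1/2} P'X Z^{-1} & I & P^{-1/2} P'F\, F^{-1/2} \\
F^{-1/2} F'X Z^{-1} & F^{-1/2} F'P\, P^{-1/2} & I
\end{pmatrix},
$$

$$
\tilde{\beta} = U\beta \quad \text{and} \quad \tilde{c} = (U')^{-1} c.
$$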


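The preconditioned conjugate gradient algorithm used in this study was developed by Dongarra
(1991) and implemented in Fortran (see Algorithm 1). Let $(k)$ denote the current and $(k + 1)$
the next iteration. The CG method computes $\tilde{\beta}^{(k+1)}$ by iterating

$$
\tilde{\beta}^{(k+1)} = \tilde{\beta}^{(k)} + \alpha^{(k)} d^{(k)},
$$

where $\alpha^{(k)}$ is a scalar given by

$$
\alpha^{(k)} = \frac{\tilde{r}^{(k)\top} U^{-1} \tilde{r}^{(k)}}{d^{(k)\top} \tilde{A}\, d^{(k)}},
\quad \text{with } \tilde{r} = \tilde{c} - \tilde{A}\tilde{\beta}, \text{ and}
$$

$$
d^{(k+1)} = U^{-1}\tilde{r}^{(k+1)} + \delta^{(k+1)} d^{(k)}
\quad \text{with} \quad
\delta^{(k+1)} = \frac{\tilde{r}^{(k+1)\top} U^{-1} \tilde{r}^{(k+1)}}{\tilde{r}^{(k)\top} U^{-1} \tilde{r}^{(k)}}.
$$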


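The basic pseudocode is

Algorithm 1 Preconditioned conjugate gradient algorithm

procedure
    compute the preconditioning matrix $U$
    compute $\tilde{A}$ and $\tilde{c}$
    initialise $\tilde{r}^{(0)} = \tilde{c}$ and let $d^{(0)} = U^{-1}\tilde{r}^{(0)}$
    for $k = 0, 1, 2, \dots$ do
        $\alpha^{(k)} = \tilde{r}^{(k)\top} U^{-1} \tilde{r}^{(k)} \,/\, d^{(k)\top} \tilde{A}\, d^{(k)}$
        $\tilde{\beta}^{(k+1)} = \tilde{\beta}^{(k)} + \alpha^{(k)} d^{(k)}$
        $\tilde{r}^{(k+1)} = \tilde{r}^{(k)} - \alpha^{(k)} \tilde{A}\, d^{(k)}$
        $\delta^{(k+1)} = \tilde{r}^{(k+1)\top} U^{-1} \tilde{r}^{(k+1)} \,/\, \tilde{r}^{(k)\top} U^{-1} \tilde{r}^{(k)}$
        $d^{(k+1)} = U^{-1}\tilde{r}^{(k+1)} + \delta^{(k+1)} d^{(k)}$
    until the difference between $\tilde{\beta}^{(k)}$ and $\tilde{\beta}^{(k+1)}$ is less than $10^{-7}$
end procedure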


The convergence tolerance of $10^{-7}$ that we use is similar to that used by others (e.g., Abowd
et al., 2002; Hallez et al., 2007).
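The iteration can be illustrated on a small system. The sketch below is not the Fortran implementation used in the study: it swaps in a simple diagonal (Jacobi) preconditioner in place of the incomplete Cholesky variant described above, stores the matrix densely, and uses a hypothetical 3 x 3 symmetric positive definite system. The stopping rule mirrors the $10^{-7}$ tolerance on successive iterates.

```python
def matvec(A, x):
    """Dense matrix-vector product (a sparse format would be used at scale)."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(A, c, tol=1e-7, max_iter=1000):
    """Solve A beta = c for symmetric positive definite A with a diagonal preconditioner."""
    n = len(c)
    m_inv = [1.0 / A[i][i] for i in range(n)]     # inverse of diag(A)
    beta = [0.0] * n
    r = list(c)                                    # residual c - A beta at beta = 0
    z = [m * ri for m, ri in zip(m_inv, r)]        # preconditioned residual
    d = list(z)                                    # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if rz == 0.0:                              # residual is exactly zero
            break
        Ad = matvec(A, d)
        alpha = rz / dot(d, Ad)
        step = max(abs(alpha * di) for di in d)    # largest change in beta
        beta = [b + alpha * di for b, di in zip(beta, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        if step < tol:
            break
        z = [m * ri for m, ri in zip(m_inv, r)]
        rz_new = dot(r, z)
        d = [zi + (rz_new / rz) * di for zi, di in zip(z, d)]
        rz = rz_new
    return beta


A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
c = [1.0, 2.0, 3.0]
beta = pcg(A, c)
```

For this system the exact solution is (2/9, 1/9, 13/9), which the iteration reaches in a few steps.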
7.3 Identification using grouping algorithm 


The preconditioned CG algorithm does not provide a unique solution for the firm and worker
effects. The solutions depend on the initial values, preconditioning matrices and convergence
criteria, and the implicit constraints used in the algorithm are not necessarily conveniently
interpretable. The implicit constraints require the state equations to be satisfied at each iteration.
Koopmans (1949), Koopmans et al. (1950) and Fisher (1966) discussed the need to impose
model constraints to identify the underlying economic relationship in the observed data. This is
because it is possible for two parametric equations to have the same likelihood function unless
some restrictions are imposed to uniquely identify parameters. There are an infinite number of
possible constraints and solutions. Fujikoshi (1993) summarises several possible approaches for
two-way cross classified unbalanced data.
7.3.1 Issues in Identification 


We use a simplified version of model (1) in this subsection to illustrate the issues faced in
imposing appropriate model restrictions on the model for workers' wages. For simplicity, we
consider a single fixed $t$ and replace the observable worker characteristics terms $x_{it}^{\top}\alpha$ by the
fixed unknown constant $\mu$. With these simplifications, the model (1) has expectation

$$
E\{\ln(y_{it})\} = \mu + \theta_i + f_{it}^{\top}\psi = \mu + \theta_i + \psi_j, \tag{16}
$$

when worker $i$ works for firm $j$ at time $t$. With $t$ fixed, it is convenient to make the dependence
on $j$ more explicit and, just for this subsection, replace $y_{it}$ by $y_{ij}$. We consider a two-way table
of 5 workers labelled $\theta_i$ for $i = 1, \dots, 5$ and 4 firms labelled $\psi_j$ for $j = 1, \dots, 4$. If we only have
one observation in every cell, we can represent the table as shown in figure 3. In practice, we
often do not have one observation in every cell. A simple example is shown in figure 4.


We describe the data in figure 3 as balanced and in figure 4 as unbalanced. The saturated
model, the main effect without interaction model for the balanced data, is given by (16). The
model matrix $(P, F)$ is given in figure 2(a). The relationships between the columns in the model
matrix in figure 2(a) are

$$
\beta_0 = \sum_{i=1}^{5} \theta_i \tag{17a}
$$

$$
\beta_0 = \sum_{j=1}^{4} \psi_j \tag{17b}
$$

where the sums are interpreted as the sums of the vectors in the columns labelled by $\mu$ (shown
as $\beta_0$ in figure 2), the $\theta_i$ and the $\psi_j$. These relationships show that the model is
over-parameterised, with ten parameters when only eight are needed, and is therefore rank deficient.
This means that there are an infinite number of solutions that satisfy the ordinary least squares
normal equations for (1). The simplest way to identify unique solutions is by using the corner point
constraint to set redundant parameters to zero, i.e. $\theta_5 = \psi_4 = 0$ (Holmes et al., 1997). This is
shown in figure 2(b). After imposing the corner point constraint, the model is of full rank so the
normal equations have a unique solution.
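The rank argument for the balanced case can be verified directly. The sketch below builds the model matrix for 5 workers and 4 firms with one observation per cell and computes its rank by exact Gaussian elimination; the helper names (`rank`, `design`) and column labels are illustrative.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def design(cells, n_workers, n_firms, drop=()):
    """Rows [intercept | worker dummies | firm dummies], omitting columns named in `drop`."""
    names = (["mu"]
             + [f"theta{i}" for i in range(1, n_workers + 1)]
             + [f"psi{j}" for j in range(1, n_firms + 1)])
    keep = [k for k, name in enumerate(names) if name not in drop]
    rows = []
    for i, j in cells:
        full = ([1]
                + [int(i == a) for a in range(1, n_workers + 1)]
                + [int(j == b) for b in range(1, n_firms + 1)])
        rows.append([full[k] for k in keep])
    return rows


balanced = [(i, j) for i in range(1, 6) for j in range(1, 5)]       # every worker-firm cell
full = design(balanced, 5, 4)                                       # no constraints
cornered = design(balanced, 5, 4, drop=("theta5", "psi4"))          # corner point constraints
```

The unconstrained matrix has ten columns but rank eight, while the corner-constrained matrix has eight columns and rank eight.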
Figure 2: Model matrices for balanced two-way table

(a) No constraints

β0 | θ1 θ2 θ3 θ4 θ5 | ψ1 ψ2 ψ3 ψ4
 1 |  1  0  0  0  0 |  1  0  0  0
 1 |  1  0  0  0  0 |  0  1  0  0
 1 |  1  0  0  0  0 |  0  0  1  0
 1 |  1  0  0  0  0 |  0  0  0  1
 1 |  0  1  0  0  0 |  1  0  0  0
 1 |  0  1  0  0  0 |  0  1  0  0
 1 |  0  1  0  0  0 |  0  0  1  0
 1 |  0  1  0  0  0 |  0  0  0  1
 1 |  0  0  1  0  0 |  1  0  0  0
 1 |  0  0  1  0  0 |  0  1  0  0
 1 |  0  0  1  0  0 |  0  0  1  0
 1 |  0  0  1  0  0 |  0  0  0  1
 1 |  0  0  0  1  0 |  1  0  0  0
 1 |  0  0  0  1  0 |  0  1  0  0
 1 |  0  0  0  1  0 |  0  0  1  0
 1 |  0  0  0  1  0 |  0  0  0  1
 1 |  0  0  0  0  1 |  1  0  0  0
 1 |  0  0  0  0  1 |  0  1  0  0
 1 |  0  0  0  0  1 |  0  0  1  0
 1 |  0  0  0  0  1 |  0  0  0  1


Figure 3: Balanced two-way table

   | ψ1  ψ2  ψ3  ψ4
θ1 |  A   A   A   A
θ2 |  A   A   A   A
θ3 |  A   A   A   A
θ4 |  A   A   A   A
θ5 |  A   A   A   A


(b) With corner point constraints (full rank)

β0 | θ1 θ2 θ3 θ4 | ψ1 ψ2 ψ3
 1 |  1  0  0  0 |  1  0  0
 1 |  1  0  0  0 |  0  1  0
 1 |  1  0  0  0 |  0  0  1
 1 |  1  0  0  0 |  0  0  0
 1 |  0  1  0  0 |  1  0  0
 1 |  0  1  0  0 |  0  1  0
 1 |  0  1  0  0 |  0  0  1
 1 |  0  1  0  0 |  0  0  0
 1 |  0  0  1  0 |  1  0  0
 1 |  0  0  1  0 |  0  1  0
 1 |  0  0  1  0 |  0  0  1
 1 |  0  0  1  0 |  0  0  0
 1 |  0  0  0  1 |  1  0  0
 1 |  0  0  0  1 |  0  1  0
 1 |  0  0  0  1 |  0  0  1
 1 |  0  0  0  1 |  0  0  0
 1 |  0  0  0  0 |  1  0  0
 1 |  0  0  0  0 |  0  1  0
 1 |  0  0  0  0 |  0  0  1
 1 |  0  0  0  0 |  0  0  0


Figure 4: Unbalanced two-way table¹

   | ψ1  ψ2 | ψ3  ψ4
θ1 |  A   A | NA  NA
θ2 |  A   A | NA  NA
θ3 |  A   A | NA  NA
θ4 | NA  NA |  A   A
θ5 | NA  NA |  A   A


In comparison, the model matrix for the unbalanced data is shown in figure 5(a). As can be
seen from figure 4, the observation pattern forms two groups. This model is also rank deficient.
If we apply corner point constraints by setting redundant parameters to zero, i.e. $\theta_5 = \psi_4 = 0$,
the model matrix is shown in figure 5(b).


¹ A = available; NA = unavailable.
Figure 5: Model matrices for unbalanced two-way table

(a) No constraints (not full rank)

β0 | θ1 θ2 θ3 θ4 θ5 | ψ1 ψ2 ψ3 ψ4
 1 |  1  0  0  0  0 |  1  0  0  0
 1 |  1  0  0  0  0 |  0  1  0  0
 1 |  0  1  0  0  0 |  1  0  0  0
 1 |  0  1  0  0  0 |  0  1  0  0
 1 |  0  0  1  0  0 |  1  0  0  0
 1 |  0  0  1  0  0 |  0  1  0  0
 1 |  0  0  0  1  0 |  0  0  1  0
 1 |  0  0  0  1  0 |  0  0  0  1
 1 |  0  0  0  0  1 |  0  0  1  0
 1 |  0  0  0  0  1 |  0  0  0  1

(b) With corner point constraints (not full rank)

β0 | θ1 θ2 θ3 θ4 | ψ1 ψ2 ψ3
 1 |  1  0  0  0 |  1  0  0
 1 |  1  0  0  0 |  0  1  0
 1 |  0  1  0  0 |  1  0  0
 1 |  0  1  0  0 |  0  1  0
 1 |  0  0  1  0 |  1  0  0
 1 |  0  0  1  0 |  0  1  0
 1 |  0  0  0  1 |  0  0  1
 1 |  0  0  0  1 |  0  0  0
 1 |  0  0  0  0 |  0  0  1
 1 |  0  0  0  0 |  0  0  0

(c) Corner point constraints with group structure (full rank)

β0 | θ1 θ2 θ4 | ψ1 ψ3
 1 |  1  0  0 |  1  0
 1 |  1  0  0 |  0  0
 1 |  0  1  0 |  1  0
 1 |  0  1  0 |  0  0
 1 |  0  0  0 |  1  0
 1 |  0  0  0 |  0  0
 1 |  0  0  1 |  0  1
 1 |  0  0  1 |  0  0
 1 |  0  0  0 |  0  1
 1 |  0  0  0 |  0  0


However, the model matrix in figure 5(b) is still singular because $\theta_1 + \theta_2 + \theta_3 = \psi_1 + \psi_2$.
Figure 4 shows that the unbalanced data separates into two groups called connected groups
(Searle, 1987). We need to take the grouping structure into account to identify unique firm and
worker effects. There are an infinite number of possible constraints to make the model matrix
of full rank; the particular choice from these is arbitrary. An example is to impose $\psi_2 = 0$. The
model matrix for the resulting full rank model is shown in figure 5(c).
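The remaining dependency in the unbalanced case is easy to confirm by adding up indicator columns. In this sketch the observed cells follow the pattern of figure 4; the variable names are illustrative.

```python
# Observed (worker, firm) cells from the unbalanced table.
cells = [(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2),   # first connected group
         (4, 3), (4, 4), (5, 3), (5, 4)]                   # second connected group

# Indicator columns of the model matrix, one per worker and one per firm.
theta = {i: [int(w == i) for w, _ in cells] for i in range(1, 6)}
psi = {j: [int(f == j) for _, f in cells] for j in range(1, 5)}


def add(*cols):
    """Element-wise sum of columns."""
    return [sum(values) for values in zip(*cols)]


lhs = add(theta[1], theta[2], theta[3])   # workers in the first group
rhs = add(psi[1], psi[2])                 # firms in the first group
```

Both sums equal the indicator of the first group's rows, so the columns remain linearly dependent even after $\theta_5$ and $\psi_4$ are dropped. The same holds within the second group.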


Abowd et al. (2002) recognised the need to find connected groups of workers and firms to set
model constraints to analyse linked employer and employee data. Firms and workers can be
connected by a worker changing jobs or by multiple job holders who work for different firms.
These connected groups are formed in such a way that no one worker or firm can be included
in more than one group. Figures 6(a) and 6(b) show how the algorithm connects firms and
workers into mutually exclusive groups. The size of the circle represents the size of the firms to
show that connections can occur between firms of different sizes. An edge connects two firms
through a worker changing jobs from one firm to the other or holding jobs in both firms. These
connected groups are mutually exclusive because there are no additional worker movements. See
Algorithm 2 for details.


Figure 6: Connected groups
(a) First connection (b) Groups formed

[Network diagrams. Nodes are firms labelled by industry (Retail, Finance, Professional, Mining,
Agriculture) and sized by firm size; edges are worker movements between firms.]


Abowd et al. (2002) proposed a grouping algorithm to create groups of connected workers and
firms in the data for $g = 1, \dots, G$ groups (see Algorithm 2).


Algorithm 2 Grouping algorithm

procedure
    Order by firm id and then worker id.
    for group = 1: assign first firm $j$ to group $g = 1$,
    partitioning step
        repeat
            add all workers employed by a firm $j$ in group $g = 1$ to group $g = 1$.
            add all firms that have employed a worker $i$ in group $g = 1$ to group $g = 1$.
        until no more firms or workers can be added to group $g = 1$.
    end partitioning step
    for group = 2: $\forall$ worker $i \notin g = 1$ and $\forall$ firm $j \notin g = 1$ assign first firm $j$ to $g = 2$,
        repeat partitioning step and add all workers and firms in group $g = 2$ to group $g = 2$.
    for group = 3: $\forall$ worker $i \notin g = 1, 2$ and $\forall$ firm $j \notin g = 1, 2$ assign first firm $j$ to $g = 3$,
        repeat partitioning step and add all workers and all firms in group $g = 3$ to group $g = 3$.
    ...
    for group = G: $\forall$ worker $i \notin g = 1, 2, \dots, G-1$ and $\forall$ firm $j \notin g = 1, 2, \dots, G-1$ assign first
        firm $j$ to $g = G$,
        repeat partitioning step and add all workers and all firms in group $g = G$ to group $g = G$.
    until all firms are assigned.
end procedure


ABS - MICRO-DRIVERS OF AGGREGATE PRODUCTIVITY - 1351.0.55.164 22 of 46

The algorithm divides connected workers and firms into mutually exclusive groups. A group is
defined as all workers and firms that are connected through some migration of workers between
firms in that group, and such that there is no migration of a worker within the group to any
firm outside the group. The main result is that the ensuing model matrix is of full rank so the
solutions to the ordinary least squares normal equations are unique.
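The grouping idea can be sketched as a breadth-first search over the worker-firm links. This is an illustrative re-expression of the partitioning step rather than the implementation used in the study; the function name `connected_groups` and the identifiers are hypothetical, and the job list reproduces the two-group pattern of figure 4.

```python
from collections import defaultdict, deque


def connected_groups(jobs):
    """Label each worker and firm with a group number from (worker, firm) pairs."""
    firms_of = defaultdict(set)
    workers_of = defaultdict(set)
    for worker, firm in jobs:
        firms_of[worker].add(firm)
        workers_of[firm].add(worker)

    firm_group, worker_group = {}, {}
    g = 0
    for start in sorted(workers_of):              # visit firms in id order
        if start in firm_group:
            continue
        g += 1                                    # first unassigned firm opens a new group
        firm_group[start] = g
        queue = deque([start])
        while queue:                              # the partitioning step
            firm = queue.popleft()
            for worker in workers_of[firm]:       # add the firm's workers
                if worker in worker_group:
                    continue
                worker_group[worker] = g
                for other in firms_of[worker]:    # add those workers' other firms
                    if other not in firm_group:
                        firm_group[other] = g
                        queue.append(other)
    return worker_group, firm_group


jobs = [("w1", "f1"), ("w1", "f2"), ("w2", "f1"), ("w2", "f2"), ("w3", "f1"),
        ("w3", "f2"), ("w4", "f3"), ("w4", "f4"), ("w5", "f3"), ("w5", "f4")]
worker_group, firm_group = connected_groups(jobs)
```

Each worker and firm receives exactly one label, so the groups are mutually exclusive by construction.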
8 EMPIRICAL RESULTS 


8.1 Firm dynamics and aggregate productivity 


This study shows the usefulness of firm-level analysis for comparing the contribution that
entering, exiting and surviving firms make to aggregate productivity. These contributions are
quite different at the industry level. The analytical results can be extended to explore the link
between the contribution of younger firms and overall growth to inform policies and encourage
economic growth (see Andrews et al., 2015).


Figure 7 shows the estimated contributions from surviving, entering and exiting firms to
aggregate productivity using the methods of Griliches and Regev (1995) and Melitz and Polanec
(2015). Nguyen and Hansell (2014) and Melitz and Polanec (2015) note the importance of
taking into account the appropriate counterfactual to derive the contributions from surviving,
entering and exiting firms. We concur, particularly for the results from smaller industries (see
Appendix C). The results show that the differences between the methods of Griliches and Regev
(1995) and Melitz and Polanec (2015) are greater for entering and exiting firms in smaller
industries. We have also explored the aggregation method proposed by Foster et al. (2001); the
results are similar to those for the approach of Griliches and Regev (1995).


Figure 7 shows that our results are broadly consistent with published ABS annual productivity
measures at the aggregate level. Our analysis provides useful insights into the variability of
firms' contributions to aggregate productivity growth; this information is not available in ABS
publications. When we compare the published ABS results with our experimental results, we find
similar productivity growth movements over time except in 2004-05, 2009-10 and 2010-11. The
differences may be due to the fact that we use different prices to derive the volume measures.
This introduces differences in relative prices when estimating firm productivity. These differences
result in different substitution effects between labour and capital and between goods and services,
which can lead to different results (see Dumagan and Balk, 2016, and Duarte and Restuccia,
2017, on the role of relative prices in estimating productivity). In addition, we use firm-level
capital cost instead of firm-level capital stock measures for our analysis. This is because there
is no information on firm-level asset prices. This information is required to derive capital stock
measures using the perpetual inventory method (Walters and Dippelsman, 1986).
Figure 7: All industry decomposition

[Chart with two panels, Griliches and Regev (1995) and Melitz and Polanec (2015), by financial
year from 2002-03 to 2012-13. Bars: Between, Within, Enter, Exit. Lines: ABS Mfp, Experimental
results.]

Note. Between and Within are the contributions from surviving firms and Enter and Exit are the contributions from entering and
exiting firms to the aggregate productivity indicated by Experimental results. The derivation of these measures can be found in (10)
and (12) (Griliches and Regev, 1995; Melitz and Polanec, 2015). ABS Mfp is the published ABS Estimates of Industry Multifactor
Productivity (ABS, 2013).


Like Nguyen and Hansell (2014), this study has found that the net contribution from entering
and exiting firms is smaller in manufacturing than in services industries in general. The
within-industry contribution component generally has a smaller contribution in services industries.
This may imply that entering and exiting firms are the main source of productivity changes. At the
industry level, our experimental results show similar patterns to the ABS results, particularly
for the Agriculture, Forestry and Fishing (A), Construction (E), Financial and Insurance Services
(K) and Administrative Service industries. The industries with notable differences are Mining
(B) and Electricity, Gas and Water (D), especially in 2012-13. As discussed, the difference may
be caused by different price deflators and the different methods used to derive capital measures.
8.2 Firm level model results 


This study confirms the importance of correcting for endogeneity in estimating the firm-level
production function. The estimated labour coefficients for the firm models' wages (WAGES)
are higher than those for the estimated instrumental variable for labour for all industries. Table 4
shows the estimated coefficients for the firm-level model results for All industries (using (6)) and
the Agriculture, Forestry and Fishing industry (using (4)). Appendix D contains results for all
other industries.

We show estimated coefficients using both the Balanced and Imputed datasets. The first column
under each of the Balanced and Imputed subheadings contains the results from the instrumental
variable estimation and the second column contains the results from ordinary least squares. For
All industries, the estimated coefficient of WAGES is 0.723 in the balanced and 0.706 in the
imputed datasets (in the ALL.OLS columns), stronger than the estimated instrumental variable,
which is 0.691 in the balanced and 0.648 in the imputed datasets (in the ALL.2SLS columns). We
observe similar results at the industry level. For the Agriculture, Forestry and Fishing industry
(A), the estimated coefficient of WAGES is 0.212 in the balanced and 0.248 in the imputed
datasets (in the A.OLS columns), stronger than the estimated instrumental variable, which is
−0.036 in the balanced and −0.016 in the imputed datasets (in the A.2SLS columns). The lower
estimates of the labour component are consistent with similar studies using instrumental variables
to correct for endogeneity (see Breunig and Wong, 2008, and Levinsohn and Petrin, 2003). This
correction is important to avoid bias in the aggregate industry decomposition results.


Table 4: All industries (ALL) and Agriculture, Forestry and Fishing (A) industry results 


Balanced Imputed Balanced Imputed 


ALL.2SLS ALL.OLS ALL.2SLS ALL.OLS = A.2SLS A.OLS A.2SLS A.OLS 


ing) 0.691*** 0.648*** —0.036** —0.016*** 
(0.001) (0.0003) (0.018) (0.004) 
WAGES 0.723" 0.706*** 0.212" 0.248" 
(0.001) 0.0003) 0.002) 0.001) 
Lok 0.223"* o.251 0.246% 0.239% = 0.508%" ~=—0.447"** 0.453" = 0.402" 
(0.001) (0.001) (0.0002) 0.0002) 0.002) 0.002) 0.001) 0.001) 
LuM 0.242% 0.183" 0.238" 0.170" 0.114 0.069% 0.129%" 0.076" 
(0.0004) (0.0005) (0.0002) 0.0002) 0.001) 0.001) 0.001) 0.001) 
Firm_Age 0.120%* 0.054" = 0.159"* ~—0.060** 0.008  —0.061"*  0.022"*  —0.033*** 
0.001) 0.001) (0.0005) 0.0005) 0.005) 0.005) 0.003) 0.002) 
Year2003 0.170"* 0.328" 0.260% 0.445%" —0.148"* —0.094" —0.075"* —0.038"" 
0.003) 0.003) 0.002) 0.002) 0.014) 0.012) 0.007) 0.006) 
Year2013 1914" 0.107" ~—-1.700"" = 0.266*** = —0.275"*  —0.016  —0.099** 0.103*** 
0.004) 0.003) 0.002) 0.002) 0.063) 0.016) 0.015) 0.008) 
BAS _divB —0.416** —1.118"* —0.131"* —0.747"" 
0.014) 0.014) 0.006) 0.006) 
BAS divS  —0.472*** 1.022" _—0.439*** — —0.700*** 
0.003) 0.003) 0.002) 0.002) 


Observations 2,296,984 2,296,984 10,039,638 10,039,638 162,766 162,766 662,553 662,553 
Adjusted R? 0.992 0.992 0.990 0.990 0.363 0.399 0.357 0.405 


Note: *p<0.1; **p<0.05; ***p<0.01


8.3 Worker level model results 


It is essential to include firms connected by workers to uniquely identify worker and firm effects
(Abowd et al., 1999). Table 5 shows the pattern of workers who have different employers in
the sample. The columns indicate the number of years that a worker stays in the sample and
the rows correspond to the number of employers workers have over the 11 years of data. Workers
are more likely to work for more employers when they stay in the sample for longer.
There are significant worker movements between firms in the sample. Only 23.27% of workers
have one employer over the 11-year period.
Table 5: Number of job changes and number of years in the sample (% of workers)

Number of                       Number of years in sample
employers      1     2     3      4     5     6     7     8     9    10    >10   Total
1           3.54     3  2.12   5.63  0.83  0.74  0.66  0.62  0.72  0.87   4.53   23.27
2              1  2.17  1.79   3.69  1.07  0.97  0.86   0.8  0.88  1.03   4.57   18.82
3           0.33     1  1.16   2.22  1.04  0.99  0.91  0.85  0.92  1.02   4.13   14.56
4           0.12  0.46  0.61   1.26  0.81  0.87  0.84  0.81  0.88  0.95    3.5   11.11
5           0.05  0.21  0.32    0.7  0.54  0.66   0.7  0.72  0.78  0.83   2.87    8.37
6           0.02   0.1  0.16   0.39  0.34  0.46  0.54  0.59  0.66  0.69   2.29    6.24
7           0.01  0.05  0.09   0.22   0.2  0.31  0.39  0.45  0.53  0.57    1.8    4.63
8              -  0.02  0.05   0.12  0.12   0.2  0.28  0.34  0.41  0.44    1.4     3.4
9              -  0.01  0.02   0.07  0.07  0.13  0.19  0.25  0.32  0.35   1.07    2.48
10             -  0.01  0.01   0.04  0.04  0.08  0.13  0.18  0.24  0.26   0.82    1.81
>10            -  0.01  0.02   0.07  0.07  0.15  0.28  0.44  0.66  0.81   2.81    5.31
Total       5.07  7.04  6.36  14.39  5.14  5.57  5.78  6.03  6.99  7.83  29.81     100

Note. Number of employers measures how many unique ABNs a worker i has over the sample period and number of years in sample
measures how many unique year counts a worker i has in the sample.


Table 6 shows the correlation structure of the estimated components in the worker model. This
study finds a positive correlation between worker and firm effects. This is in line with the finding
of Iranzo et al. (2008) but different from Abowd et al. (2002). Andrews et al. (2008) suggest
that the negative correlation in previous studies may arise from a lack of worker mobility, which
is not the case in this Australian sample.


Table 6: Pearson correlation coefficients of estimated components

        logL          θ             ψ             Xα            ε
logL    -             0.3063***     0.5490***     −0.2115***    0.5923***
θ       0.3063***     -             0.1058***     −0.9793***    −0.0085***
ψ       0.5490***     0.1058***     -             −0.0966***    −0.0021***
Xα      −0.2115***    −0.9793***    −0.0966***    -             −0.0267***
ε       0.5923***     −0.0085***    −0.0021***    −0.0267***    -

Note. Prob > |r| under N = 130,281,096.
*p<0.1; **p<0.05; ***p<0.01.
9 CONCLUSIONS AND FUTURE DIRECTIONS


This study shows the value of using microdata to better understand the components of
industry-level productivity growth. It explores methods for fitting a model for workers by solving a
large sparse linear system of equations and uses the estimated results to correct for endogeneity in
the firm's decisions about how much labour to employ. The paper also calculates the contribution
of entering, exiting and surviving firms to aggregate productivity at the industry level.

Our results show the importance of correcting for endogeneity in estimating the production
function. The productivity contributions from surviving, entering and exiting firms are quite
different across different industries. Understanding these differences may be useful to inform
policy.

Across all industries, we generally find that firm exit is the most important contributor to
productivity growth. Firm entry generally has a negative impact on industry-level productivity
growth. This is similar to what was found by Breunig and Wong (2008) for the 1990s in
Australia. It is not surprising, as many new firms end up not surviving. They may lack access
to industry-specific knowledge and skills.

Within-firm productivity increases are generally a positive contributor to industry-level
productivity, but are very small in about half of the industry groups we examine. Re-allocation
effects for continuing firms are virtually non-existent. Almost all of the reallocation is happening
through entry and exit.

This would suggest that policies which facilitate firm entry and exit are likely to help in achieving
increased productivity gains. Policies which provide large advantages to incumbent firms (such
as cumbersome regulation which is difficult to comply with for new entrants) are likely to detract
from productivity growth.


Our analysis could be extended in several ways. First, with a better proxy for worker skill
such as education, we could better account for the effects of workers. Capturing workers' skill
dispersion across and between firms would be useful. Secondly, it would be interesting to explore
other estimation approaches, like Constant Elasticity of Substitution production functions that
allow the elasticity of substitution between capital and labour inputs to vary, to better understand
relative price effects (McFadden, 1963; Steenkamp, 2017). Increased data access and
better measures of key variables are both required for such analyses.
A IMPUTATION METHODS FOR CATEGORICAL DATA


We use the available information from the experimental dataset to allocate firm $j$ belonging to
an unknown industry $U$ into different industries. The font $\mathcal{X}$ represents the observed dataset in
the notation. The formula to allocate firms into different industries is:

$$
\Pr(j = k \mid \mathcal{X}_{jkt}) = \frac{\exp(\mathcal{X}_{jkt}^{\top}\alpha_k)}{1 + \sum_{k=1}^{K-1} \exp(\mathcal{X}_{jkt}^{\top}\alpha_k)},
$$

$$
\Pr(j = K \mid \mathcal{X}_{jkt}) = \frac{1}{1 + \sum_{k=1}^{K-1} \exp(\mathcal{X}_{jkt}^{\top}\alpha_k)}.
$$


The 1 terms in the denominator and in the numerator of $\Pr(j = K \mid \mathcal{X}_{jkt})$ ensure that the
probabilities over the response categories sum to 1 (Czepiel, 2002; Agresti, 2007). It is convenient
to write the term $\mathcal{X}_{jkt}^{\top}\alpha_k$ in Wilkinson and Rogers's (1973) notation. The term $\mathcal{X}$ contains
Firm_Age + Employees + $\tau$, where Firm_Age is the age of firms and Employees is the number
of employees that firm $j$ has. The variable $\tau$ is represented by 10 time-indicator variables,
one for each year with 2001-02 as the baseline. This makes each $\mathcal{X}_{jkt}^{\top}\alpha_k$ a sum of 12 terms. The
formula is applied to the complete cases to obtain the industry coefficients $\alpha_k$ with $k = 1, \dots, 17$
industries. We combine these estimated coefficients with firm characteristics data $\mathcal{X}_{jUt}$ for
firms with the missing industry. We allocate firm $j$ to the industry with the highest predictive
probability.
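The allocation rule can be illustrated with a small sketch. The coefficients and firm characteristics below are hypothetical (two non-baseline industries and two covariates rather than the 17 industries and 12 terms used in the study), and the function name `industry_probabilities` is an assumption for illustration.

```python
import math


def industry_probabilities(x, coefs):
    """Multinomial logit probabilities with the last category as the baseline."""
    scores = [sum(a * v for a, v in zip(alpha, x)) for alpha in coefs]
    denom = 1.0 + sum(math.exp(s) for s in scores)
    # K - 1 non-baseline categories followed by the baseline category K.
    return [math.exp(s) / denom for s in scores] + [1.0 / denom]


# Hypothetical coefficient vectors for K - 1 = 2 non-baseline industries.
coefs = [[0.10, 0.02], [-0.05, 0.01]]
x = [3.0, 10.0]   # hypothetical firm characteristics, e.g. firm age and employees

probs = industry_probabilities(x, coefs)
allocated = max(range(len(probs)), key=probs.__getitem__)   # index of most likely industry
```

The firm is assigned to the category with the largest predicted probability, which here is the first non-baseline industry.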
B IMPUTATION METHODS FOR CONTINUOUS DATA


Next, we assume the data are missing at random (MAR) and impute missing values in the combined
ABS and IPGOD datasets by imputed industry. We use sequential regression in the SAS proc mi
procedure for the imputation. We adapt a similar notation to Reiter (2005). The experimental
dataset consists of $[y, \mathcal{X}]$, where $y$ is an $N \times 1$ vector that includes the dependent variable, and
$\mathcal{X}$ is an $N \times 15$ matrix that includes all the independent variables from (3). This gives 15
unknown regression parameters in (3). We impute missing variables $\ln y$, $\ln K$ and $\ln M$. The
observed dataset consists of two $N \times 16$ matrices, $D = [y, \mathcal{X}]$, where $\mathcal{X}$ includes all the independent
variables from (3); and the response indicator matrix $R$ which we use to partition $D$ into the
observed $D^{obs}$ and the missing $D^{mis}$. We use $\mathcal{X}$, $\mathcal{X}^{(K)}$ and $\mathcal{X}^{(M)}$ to denote the design matrix
for imputing missing data in $\ln y$, $\ln K$ and $\ln M$, respectively.


We impute the missing values in $\ln y$, $\ln K$ and $\ln M$ separately using sequential regression (SR).
The SR method uses appropriate regression models for different variable types. For example,
continuous variables are imputed using a normal model and binary variables using a logit model.
The SR method generates a continuous vector $y^{mis}$ from the parameters directly estimated from
the fitted regression following Raghunathan et al. (2001). The SR formula for generating missing
data for $y$ is:

$$
y = \mathcal{X}\beta. \tag{19}
$$


We apply (19) three times, with $y$ denoting each of the three variables $\ln y$, $\ln K$ and $\ln M$.
We use $\mathcal{X}$, $\mathcal{X}^{(K)}$ and $\mathcal{X}^{(M)}$ to denote the design matrix for creating missing data in $\ln y$, $\ln K$
and $\ln M$, respectively. If the missing data variable is $\ln y$, then $\mathcal{X}$ includes all the independent
variables from (3). In comparison, if the missing data variable is $\ln K$, then $\mathcal{X}^{(K)}$ includes all
the independent variables and $\ln y$ but excludes $\ln K$. Similarly, if the missing data variable is
$\ln M$, then $\mathcal{X}^{(M)}$ includes all the independent variables and $\ln y$ but excludes $\ln M$. Algorithm 3
describes the basic concept of the algorithm (Drechsler, 2011).


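Algorithm 3 Sequential regression algorithm

procedure
    Step 1: draw a new value $\theta = (\sigma^2, \beta)$ from $\Pr(\theta \mid y_{obs})$
        draw variance from $\sigma^2 \mid \mathcal{X}_{obs} \sim (y_{obs} - \mathcal{X}_{obs}\hat{\beta})^{\top}(y_{obs} - \mathcal{X}_{obs}\hat{\beta})\,\chi^{-2}_{n-k}$, where $n$ is the total
            number of observations and $k$ is the number of parameters
        draw coefficients from $\beta \mid \sigma^2, \mathcal{X}_{obs} \sim N(\hat{\beta}, (\mathcal{X}_{obs}^{\top}\mathcal{X}_{obs})^{-1}\sigma^2)$
    Step 2: draw an imputed value $y^{mis}$ from $\Pr(y^{mis} \mid y_{obs}, \theta)$
        draw from fitted regression $y^{mis} \mid \beta, \sigma^2, \mathcal{X}_{mis} \sim N(\mathcal{X}_{mis}\beta, \sigma^2)$
    repeat Step 1 and Step 2 to impute each variable sequentially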


We create 10 imputed datasets in each imputed industry and we select the best imputed dataset
which maximises the likelihood for equation (3) from the 10 datasets in each industry (Schomaker
and Heumann, 2014; Chien et al., 2018).
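The two drawing steps of Algorithm 3 can be sketched for the simplest case of one variable with missing values and a single regressor plus an intercept. This is an illustration under those simplifying assumptions, not the SAS proc mi procedure used in the study; the function name `impute_once` and the data are hypothetical.

```python
import random


def impute_once(x, y, rng):
    """One sequential-regression draw for missing y given a single regressor x."""
    obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n, k = len(obs), 2                             # observations and parameters
    sx = sum(xi for xi, _ in obs)
    sxx = sum(xi * xi for xi, _ in obs)
    sy = sum(yi for _, yi in obs)
    sxy = sum(xi * yi for xi, yi in obs)
    det = n * sxx - sx * sx
    b0 = (sxx * sy - sx * sxy) / det               # OLS intercept
    b1 = (n * sxy - sx * sy) / det                 # OLS slope
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in obs)

    # Step 1a: sigma^2 = RSS / chi-square(n - k); chi-square(v) is Gamma(v / 2, scale 2).
    sigma2 = rss / rng.gammavariate((n - k) / 2.0, 2.0)

    # Step 1b: beta ~ N(beta_hat, sigma^2 (X'X)^{-1}), drawn via a 2 x 2 Cholesky factor.
    v00, v01, v11 = sigma2 * sxx / det, -sigma2 * sx / det, sigma2 * n / det
    l00 = v00 ** 0.5
    l10 = v01 / l00
    l11 = (v11 - l10 * l10) ** 0.5
    z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
    d0, d1 = b0 + l00 * z0, b1 + l10 * z0 + l11 * z1

    # Step 2: draw each missing y from N(x'beta, sigma^2); observed values are kept.
    return [yi if yi is not None else rng.gauss(d0 + d1 * xi, sigma2 ** 0.5)
            for xi, yi in zip(x, y)]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, None, 8.2, None, 12.1]              # None marks a missing value
completed = impute_once(x, y, random.Random(0))    # seeded for reproducibility
```

Repeating the call with different seeds would give several completed datasets, from which one could be selected as described above.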
C INDUSTRY DECOMPOSITION

Figure 8: Industry Decomposition

Figure 9: Industry Decomposition
(a) Accommodation and Food Services
(b) Information and Telecommunications
(c) Rental, Hiring and Real Estate Services
(d) Transport, Postal and Warehousing
(e) Financial and Insurance Services
(f) Professional and Technical Services

Figure 10: Industry Decomposition
(b) Education and Training
(c) Arts and Recreation Services

[Each panel contains two charts, Griliches and Regev (1995) and Melitz and Polanec (2015), by
financial year from 2002-03 to 2012-13. Bars: Between, Within, Enter, Exit. Lines: ABS Mfp (RHS)
and Experimental results (LHS).]
D FIRM MODEL RESULTS


Table 7: Mining (B) and Manufacturing (C) industries results 


Balanced  Imputed  Balanced  Imputed
B.2SLS  B.OLS  B.2SLS  B.OLS  C.2SLS  C.OLS  C.2SLS  C.OLS
ln(labour)  0.584***  0.235***  0.565***  0.090***
  (0.115)  (0.014)  (0.012)  (0.003)
WAGES  0.563***  0.546***  [illegible]  0.469***
  (0.017)  (0.005)  (0.002)  (0.001)
LnK  [illegible]  [illegible]  0.226  [illegible]  [illegible]  [illegible]  0.170***  [illegible]
  (0.011)  (0.011)  (0.004)  (0.003)  (0.002)  (0.001)  (0.001)  (0.001)
LnM  [illegible]  0.090***  0.203***  0.108***  0.317  0.180***  0.352***  [illegible]
  (0.007)  (0.007)  (0.003)  (0.003)  (0.001)  (0.001)  (0.001)  (0.001)
lnFirm_Age  -0.012  0.016  0.002  -0.008  0.111***  0.041**  0.135***  0.049***
  (0.022)  (0.019)  (0.007)  (0.006)  (0.002)  (0.002)  (0.002)  (0.001)
Observations  4,902  4,902  36,559  36,559  288,335  288,335  645,869  645,869
R²  0.289  0.413  0.261  0.431  0.303  0.414  0.320  0.430
Adjusted R²  0.287  0.411  [illegible]  [illegible]  [illegible]  [illegible]  0.320  0.430
Note: *p<0.1; **p<0.05; ***p<0.01


Table 8: Electricity, Gas, Water and Waste Services (D) and Construction (E) industries results 


Balanced  Imputed  Balanced  Imputed
D.2SLS  D.OLS  D.2SLS  D.OLS  E.2SLS  E.OLS  E.2SLS  E.OLS
ln(labour)  0.409***  0.114***  0.123***  0.025***
  (0.101)  (0.015)  (0.010)  (0.002)
WAGES  0.541***  0.444**  0.403***  0.355***
  (0.016)  (0.006)  (0.002)  (0.001)
LnK  0.294***  0.180***  [illegible]  0.232**  0.170***  0.129***  0.164***  0.134**
  (0.011)  (0.010)  (0.004)  (0.004)  (0.001)  (0.001)  (0.001)  (0.001)
LnM  0.134***  0.069***  0.145***  0.084***  0.229***  0.169***  0.245***  0.183***
  (0.007)  (0.007)  (0.003)  (0.003)  (0.001)  (0.001)  (0.0005)  (0.0005)
lnFirm_Age  0.079***  0.016  0.122**  0.060***  0.030***  0.012***  0.063***  0.018***
  (0.019)  (0.016)  (0.008)  (0.007)  (0.002)  (0.002)  (0.001)  (0.001)
Observations  5,022  5,022  28,837  28,837  373,859  373,859  1,477,460  1,477,460
Adjusted R²  0.234  0.382  0.269  0.387  0.222  0.313  0.248  0.336
Note: *p<0.1; **p<0.05; ***p<0.01

Table 9: Wholesale Trade (F) and Retail Trade (G) industries results

Balanced  Imputed  Balanced  Imputed
F.2SLS  F.OLS  F.2SLS  F.OLS  G.2SLS  G.OLS  G.2SLS  G.OLS
ln(labour)  0.570***  0.110***  0.622***  0.162***
  (0.017)  (0.004)  (0.010)  (0.003)
WAGES  0.493***  0.442***  0.450***  0.412***
  (0.003)  (0.002)  (0.002)  (0.001)
LnK  0.167***  0.085***  0.160***  0.096***  0.139***  0.088***  0.132***  0.090***
  (0.002)  (0.002)  (0.001)  (0.001)  (0.001)  (0.001)  (0.001)  (0.001)
LnM  0.336***  0.235***  0.363***  0.252***  0.362***  0.246***  0.395***  0.271**
  (0.002)  (0.002)  (0.001)  (0.001)  (0.001)  (0.001)  (0.001)  (0.001)
lnFirm_Age  0.112***  0.052***  0.138***  0.059***  0.187***  0.111***  0.221***  0.126**
  (0.003)  (0.003)  (0.002)  (0.002)  (0.002)  (0.002)  (0.001)  (0.001)
Observations  213,389  213,389  480,515  480,515  434,058  434,058  1,072,727  1,072,727
Adjusted R²  0.254  0.330  0.292  0.370  0.241  0.310  0.273  0.345
Note: *p<0.1; **p<0.05; ***p<0.01


Table 10: Accommodation and Food Services (H) and Transport, Postal and Warehousing (I) industries results

Balanced  Imputed  Balanced  Imputed
H.2SLS  H.OLS  H.2SLS  H.OLS  I.2SLS  I.OLS  I.2SLS  I.OLS
ln(labour)  [illegible]  [illegible]  0.604***  0.086***
  (0.013)  (0.003)  (0.032)  (0.003)
WAGES  0.447***  0.420***  0.417***  0.424***
  (0.002)  (0.001)  (0.005)  (0.001)
LnK  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
  (0.002)  (0.002)  (0.001)  (0.001)  (0.004)  (0.003)  (0.001)  (0.001)
LnM  [illegible]  0.268***  0.452***  0.315***  0.132***  0.091***  0.105***  0.080***
  (0.002)  (0.002)  (0.001)  (0.001)  (0.002)  (0.002)  (0.001)  (0.001)
lnFirm_Age  0.279***  0.213***  0.299***  0.224**  -0.103***  0.029***  0.170***  -0.097***
  (0.002)  (0.002)  (0.001)  (0.001)  (0.007)  (0.006)  (0.002)  (0.002)
Observations  258,373  258,873  721,244  721,244  45,677  45,677  463,843  463,843
Adjusted R²  0.360  0.433  0.383  0.461  0.229  0.327  0.257  0.380
Note: *p<0.1; **p<0.05; ***p<0.01

Table 11: Telecommunications (J) and Financial and Insurance Services (K) industries results

Balanced  Imputed  Balanced  Imputed
J.2SLS  J.OLS  J.2SLS  J.OLS  K.2SLS  K.OLS  K.2SLS  K.OLS
ln(labour)  [illegible]  0.095***  1.366***  0.081***
  (0.061)  (0.009)  (0.053)  (0.003)
WAGES  0.593***  0.566***  0.592**  0.529***
  (0.009)  (0.003)  (0.008)  (0.001)
LnK  [illegible]  0.138***  0.239***  0.142***  0.224***  0.092***  0.248***  0.139***
  (0.007)  (0.006)  (0.003)  (0.002)  (0.006)  (0.006)  (0.001)  (0.001)
LnM  0.207***  0.116***  0.246**  [illegible]  0.202***  [illegible]  0.188***  [illegible]
  (0.005)  (0.005)  (0.002)  (0.002)  (0.004)  (0.004)  (0.001)  (0.001)
lnFirm_Age  0.056***  0.047***  0.144***  0.050***  0.090***  0.092***  0.124***  0.072***
  (0.013)  (0.011)  (0.005)  (0.004)  (0.012)  (0.011)  (0.002)  (0.002)
Observations  13,619  13,619  89,794  89,794  21,013  21,013  471,502  471,502
Adjusted R²  0.267  0.430  0.266  0.461  0.237  0.384  0.242  0.435
Note: *p<0.1; **p<0.05; ***p<0.01


Table 12: Rental, Hiring and Real Estate Services (L) and Professional Services (M) industries results 


Balanced  Imputed  Balanced  Imputed
L.2SLS  L.OLS  L.2SLS  L.OLS  M.2SLS  M.OLS  M.2SLS  M.OLS
ln(labour)  1.663***  0.025  [illegible]  0.156***
  (0.036)  (0.004)  (0.019)  (0.002)
WAGES  0.546***  0.404***  0.607***  0.577***
  (0.006)  (0.002)  (0.003)  (0.001)
LnK  0.278***  0.176***  0.262***  0.209***  0.215**  0.104**  0.152***  0.098***
  (0.004)  (0.004)  (0.001)  (0.001)  (0.002)  (0.002)  (0.001)  (0.001)
LnM  0.200***  0.123***  0.204***  0.141***  0.144***  0.085***  0.165***  0.089***
  (0.003)  (0.003)  (0.001)  (0.001)  (0.002)  (0.001)  (0.0005)  (0.0004)
lnFirm_Age  0.106***  0.076***  0.181***  0.117***  0.027***  0.008**  0.064***  0.009***
  (0.008)  (0.007)  (0.002)  (0.002)  (0.004)  (0.003)  (0.001)  (0.001)
Observations  40,665  40,665  386,405  386,405  124,096  124,096  1,298,560  1,298,560
R²  0.288  0.391  0.260  0.356  0.178  0.400  0.161  0.427
Adjusted R²  0.288  0.391  0.260  0.355  0.178  0.400  0.161  0.427
Note: *p<0.1; **p<0.05; ***p<0.01

Table 13: Administrative and Support Services (N) and Public Administration and Safety (O) industries results

Balanced  Imputed  Balanced  Imputed
N.2SLS  N.OLS  N.2SLS  N.OLS  O.2SLS  O.OLS  O.2SLS  O.OLS
ln(labour)  0.876***  0.174***  0.241***  0.118***
  (0.036)  (0.004)  (0.079)  (0.012)
WAGES  0.576***  0.549***  0.581***  0.563***
  (0.005)  (0.001)  (0.013)  (0.004)
LnK  0.246***  0.141***  0.222***  0.136***  0.227***  [illegible]  0.224***  0.145***
  (0.004)  (0.003)  (0.001)  (0.001)  (0.009)  (0.008)  (0.003)  (0.003)
LnM  0.130***  0.069***  0.143***  0.073***  0.179***  0.105***  0.255***  0.148***
  (0.002)  (0.002)  (0.001)  (0.001)  (0.007)  (0.006)  (0.003)  (0.002)
lnFirm_Age  0.044***  0.003  0.116**  0.043***  0.043***  -0.018  0.093***  0.025***
  (0.007)  (0.006)  (0.002)  (0.002)  (0.015)  (0.013)  (0.006)  (0.005)
Observations  43,106  43,106  441,659  441,659  6,653  6,653  56,044  56,044
Adjusted R²  0.246  0.410  0.261  0.451  0.276  0.448  0.375  0.550
Note: *p<0.1; **p<0.05; ***p<0.01


Table 14: Education and Training (P) and Health Care and Social Assistance (Q) industries results

Balanced  Imputed  Balanced  Imputed
P.2SLS  P.OLS  P.2SLS  P.OLS  Q.2SLS  Q.OLS  Q.2SLS  Q.OLS
ln(labour)  [illegible]  0.184***  0.756***  0.228***
  (0.067)  (0.008)  (0.038)  (0.004)
WAGES  0.592***  0.555***  0.593***  0.583***
  (0.010)  (0.002)  (0.006)  (0.001)
LnK  0.218***  0.096***  0.195***  0.110**  0.237***  [illegible]  0.120***  0.069***
  (0.008)  (0.007)  (0.002)  (0.002)  (0.005)  (0.004)  (0.001)  (0.001)
LnM  0.203***  0.108***  0.265***  0.155***  0.125***  0.078***  0.186***  0.111***
  (0.005)  (0.005)  (0.001)  (0.001)  (0.003)  (0.003)  (0.001)  (0.001)
lnFirm_Age  0.087***  0.037***  0.168***  0.041***  0.114***  0.025***  0.202***  0.012***
  (0.014)  (0.012)  (0.004)  (0.004)  (0.007)  (0.006)  (0.002)  (0.002)
Observations  10,478  10,478  181,655  181,655  35,078  35,078  629,550  629,550
Adjusted R²  0.311  0.481  0.297  0.485  0.181  0.351  0.153  0.358
Note: *p<0.1; **p<0.05; ***p<0.01

Table 15: Arts and Recreation Services (R) and Other Services (S) industries results

Balanced  Imputed  Balanced  Imputed
R.2SLS  R.OLS  R.2SLS  R.OLS  S.2SLS  S.OLS  S.2SLS  S.OLS
ln(labour)  1.093***  0.154***  [illegible]  0.049**
  (0.052)  (0.008)  (0.013)  (0.003)
WAGES  0.589***  0.558***  0.498***  0.459***
  (0.008)  (0.002)  (0.003)  (0.001)
LnK  0.208***  0.113***  0.148***  0.065***  0.158***  0.101***  0.144***  0.106***
  (0.005)  (0.005)  (0.002)  (0.002)  (0.002)  (0.002)  (0.001)  (0.001)
LnM  0.234***  0.152**  0.288***  0.192***  0.279***  0.179***  [illegible]  0.229***
  (0.004)  (0.004)  (0.002)  (0.001)  (0.001)  (0.001)  (0.001)  (0.001)
lnFirm_Age  0.139**  0.073***  0.182***  0.062***  0.100***  0.035***  [illegible]  -0.054***
  (0.010)  (0.009)  (0.004)  (0.004)  (0.003)  (0.002)  (0.002)  (0.001)
Observations  19,502  19,502  146,098  146,098  196,393  196,393  748,764  748,764
Adjusted R²  0.316  0.462  0.290  0.478  0.267  0.383  0.315  0.427
Note: *p<0.1; **p<0.05; ***p<0.01

E SUMMARY STATISTICS


Table 16: Summary statistics: firm-level productivity model data 


Statistic     N            P1st    P50th   P99th   St. Dev.
Balanced data
lny           2,296,984    7.28    10.57   13.34   1.14
ln(labour)    2,296,984    6.06    7.49    9.70    1.06
lnK           2,296,984    5.01    8.88    11.60   1.27
lnM           2,296,984    5.40    10.59   14.11   1.74
lnFirm_Age    2,296,984    0.00    1.79    2.94    0.79
WAGES         2,296,984    6.53    9.82    11.90   0.99
Imputed data
lny           10,039,638   7.03    10.55   13.33   1.24
ln(labour)    10,039,638   6.03    7.38    9.77    1.08
lnK           10,039,638   4.41    8.69    11.76   1.48
lnM           10,039,638   5.06    10.18   14.41   1.94
lnFirm_Age    10,039,638   0.00    1.79    2.94    0.83
WAGES         10,039,638   6.30    9.75    12.03   1.13

lny is the logarithm of output (i.e., sales adjusted for repurchase of stock) deflated by industry gross value added implicit price deflators.

ln(labour) is the logarithm of estimated labour inputs.

lnK is the logarithm of capital, which includes depreciation, capital rental expenses and capital work deductions, deflated by the industry consumption of fixed capital implicit price deflators.

lnM is the logarithm of material costs deflated by Producer Price Indexes: Intermediate Goods (ABS, 2018b).

lnFirm_Age is the logarithm of firm age. Firm age is derived as the current year minus the year of incorporation.

lnWAGES is the logarithm of wage costs (reported in Business Activity Statements) deflated by the Wage Price Index: All Industries.
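
As a rough illustration of how variables of this kind could be constructed, the Python sketch below takes a nominal value and a price index and returns the logarithm of the deflated value, and derives the logarithm of firm age from the year of incorporation. The function names, the index base of 100 and the rejection of firms with non-positive age are assumptions made for this example only; they are not taken from the ABS processing code.

```python
import math


def log_deflated(nominal_value, deflator_index, index_base=100.0):
    """Natural log of a nominal value expressed in base-period prices.

    index_base=100.0 is an assumed convention for the price index.
    """
    real_value = nominal_value / (deflator_index / index_base)
    return math.log(real_value)


def ln_firm_age(current_year, incorporation_year):
    """Log of firm age, where age = current year - year of incorporation.

    Assumption: firms with age <= 0 are rejected, since the log is undefined.
    """
    age = current_year - incorporation_year
    if age <= 0:
        raise ValueError("firm age must be positive to take its logarithm")
    return math.log(age)
```

For instance, `ln_firm_age(2013, 2007)` returns ln 6, about 1.79, and a firm aged one year has a value of 0.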




Table 17: Summary statistics: worker equation 


Statistic     N             Mean      Median    Min     Max
SKILLH        130,281,096   0.31      0         0       1
SKILLHM       130,281,096   0.11      0         0       1
SKILLIM       130,281,096   0.12      0         0       1
2003          130,281,096   0.07      0         0       1
2004          130,281,096   0.07      0         0       1
2005          130,281,096   0.07      0         0       1
2006          130,281,096   0.07      0         0       1
2007          130,281,096   0.08      0         0       1
2008          130,281,096   0.08      0         0       1
2009          130,281,096   0.08      0         0       1
2010          130,281,096   0.12      0         0       1
2011          130,281,096   0.11      0         0       1
2012          130,281,096   0.10      0         0       1
2013          130,281,096   0.09      0         0       1
AGE           130,281,096   37        37        17      64
AGE²          130,281,096   1549      1369      289     4096
AGE³          130,281,096   70029     50653     4913    262144
AGE⁴          130,281,096   3370501   1874161   83521   16777216
SEX : AGE     130,281,096   19        18        0       64
SEX : AGE²    130,281,096   792       324       0       4096
SEX : AGE³    130,281,096   35825     5832      0       262144
SEX : AGE⁴    130,281,096   1726528   104976    0       16777216
SEX : 2003    130,281,096   0.03      0         0       1
SEX : 2004    130,281,096   0.04      0         0       1
SEX : 2005    130,281,096   0.04      0         0       1
SEX : 2006    130,281,096   0.04      0         0       1
SEX : 2007    130,281,096   0.04      0         0       1
SEX : 2008    130,281,096   0.04      0         0       1
SEX : 2009    130,281,096   0.04      0         0       1
SEX : 2010    130,281,096   0.06      0         0       1
SEX : 2011    130,281,096   0.06      0         0       1
SEX : 2012    130,281,096   0.05      0         0       1
SEX : 2013    130,281,096   0.05      0         0       1


The indicator variable High Skill (SKILLH) equals 1 if a worker has at least a tertiary qualification and 0 otherwise.

The indicator variable Medium Skill (SKILLHM) equals 1 if a worker has at most a diploma qualification and 0 otherwise.

The indicator variable Working Skill (SKILLIM) equals 1 if a worker has at most a Certificate III qualification and 0 otherwise.

There are 11 time indicator variables from 2003 to 2013, one for each year. Note that 2003 represents financial year 2002-03.

AGE is worker age in years, derived as the current year minus the year of birth. AGE², AGE³ and AGE⁴ are the quadratic, cubic and quartic terms in worker age.

SEX : AGE, SEX : AGE², SEX : AGE³ and SEX : AGE⁴ are the interaction terms between worker sex (SEX) and the polynomial in AGE.

SEX : 2003, ..., SEX : 2013 are the interaction terms between SEX and the time indicator variables.

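
The age polynomial, year indicators and SEX interactions described above can be sketched for a single worker-year record as follows. This is only an illustration: the function name, the dictionary layout, the `AGE^2` style keys and the 0/1 coding of SEX are assumptions, AGE is treated as age in years (which is what the minimum of 17 and maximum of 64 in Table 17 suggest), and the skill indicators are left out for brevity.

```python
YEARS = range(2003, 2014)  # 2003 stands for financial year 2002-03


def worker_regressors(age, sex, year):
    """Build the age, year and SEX interaction terms for one worker-year.

    age  -- worker age in years
    sex  -- 0 or 1 (which sex is coded 1 is an assumption of this sketch)
    year -- financial year label between 2003 and 2013
    """
    if year not in YEARS:
        raise ValueError("year must be between 2003 and 2013")
    row = {}
    # One indicator per year, plus its interaction with SEX.
    for y in YEARS:
        indicator = 1 if y == year else 0
        row[str(y)] = indicator
        row[f"SEX:{y}"] = sex * indicator
    # Quartic polynomial in age, plus its interaction with SEX.
    for power in range(1, 5):
        name = "AGE" if power == 1 else f"AGE^{power}"
        row[name] = age ** power
        row[f"SEX:{name}"] = sex * age ** power
    return row
```

Evaluating the sketch at ages 17 and 64 gives 289, 4913 and 83521, and 4096, 262144 and 16777216, for the squared, cubed and fourth-power terms, which match the minimum and maximum values reported in Table 17.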
F FIRM ENTRY AND EXIT RATES


We follow Nguyen and Hansell (2014) and define the firm entry rate as the number of new firms divided by the total number of incumbent and entering firms in a given year. The exit rate is defined as the number of firms exiting the market in a given year divided by the number of incumbents in the previous year.
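
A minimal Python sketch of these two rates is given below. It assumes that firms are identified by the set of firm identifiers active in each year, that a new firm is one active in the current year but not in the previous year, that an exiting firm is one active in the previous year but not in the current year, and that the incumbents in the previous year are all firms active in that year. The function name and these operational definitions are illustrative assumptions, not the authors' implementation.

```python
def entry_exit_rates(previous_year_firms, current_year_firms):
    """Return (entry_rate, exit_rate) from two collections of firm identifiers."""
    previous = set(previous_year_firms)
    current = set(current_year_firms)
    if not previous or not current:
        raise ValueError("both years need at least one active firm")

    entrants = current - previous    # active now, not active last year
    incumbents = current & previous  # active in both years
    exiters = previous - current     # active last year, not active now

    # Entry rate: new firms over incumbent plus entering firms in the year.
    entry_rate = len(entrants) / (len(incumbents) + len(entrants))
    # Exit rate: exiting firms over firms active in the previous year.
    exit_rate = len(exiters) / len(previous)
    return entry_rate, exit_rate
```

With five firms active in the previous year, of which two exit, and one new firm in the current year, the sketch gives an entry rate of 1/4 and an exit rate of 2/5.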


Figure 11: Firm entry and exit rates 


(a) years from 2002-03 to 2004-05
(b) years from 2008-09 to 2010-11
(c) years from 2005-06 to 2007-08
(d) years from 2011-12 to 2012-13
[Bar charts not reproduced. Each panel plots entry and exit rates for each year shown, by industry: A agriculture, B mining, C manufacturing, D electricity, E construction, F wholesale, G retail, H accommodation, I transport, J telecom, K finance, L rental, M professional, N administration, O public, P education, Q health, R recreation, S other, and All** all industry.]


Note. ALL** represents all industries. 